deepr.examples.movielens.jobs package

Submodules

deepr.examples.movielens.jobs.build_records module

Build MovieLens dataset as TFRecords.

class deepr.examples.movielens.jobs.build_records.BuildRecords(path_ratings, path_mapping, path_train, path_eval, path_test, min_rating=4, min_length=5, num_negatives=8, target_ratio=0.2, size_test=10000, size_eval=10000, shuffle_timelines=True, seed=2020)[source]

Bases: Job

Build MovieLens dataset as TFRecords.

It aggregates movie ratings by user and builds timelines of movies. The users are split into train / validation / test sets. Each timeline is split into two sub-timelines: one input, one target. For each item in the target, num_negatives negatives are sampled. (A usage sketch follows the attribute list below.)

The resulting TFRecords have the following fields:

- “uid”: ()
- “inputPositives”: [size_input]
- “targetPositives”: [size_target]
- “targetNegatives”: [size_target, num_negatives]

min_length: int = 5
min_rating: int = 4
num_negatives: int = 8
path_eval: str
path_mapping: str
path_ratings: str
path_test: str
path_train: str
run()[source]

Run Job

seed: int = 2020
shuffle_timelines: bool = True
size_eval: int = 10000
size_test: int = 10000
target_ratio: float = 0.2
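
A minimal usage sketch (the paths below are placeholders, not paths shipped with the package)::

    from deepr.examples.movielens.jobs.build_records import BuildRecords

    job = BuildRecords(
        path_ratings="ml-20m/ratings.csv",       # placeholder input path
        path_mapping="data/mapping.txt",         # placeholder vocabulary mapping path
        path_train="data/train.tfrecord",
        path_eval="data/eval.tfrecord",
        path_test="data/test.tfrecord",
        min_rating=4,
        min_length=5,
        num_negatives=8,
        target_ratio=0.2,
        shuffle_timelines=True,
        seed=2020,
    )
    job.run()
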
deepr.examples.movielens.jobs.build_records.get_timelines(path_ratings, min_rating, min_length)[source]

Build timelines from MovieLens Dataset.

Apply the following filters:

- keep movies with ratings > min_rating
- keep users with number of movies > min_length

Return type:

List[Tuple[str, List[int]]]
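
The filtering and grouping logic could be sketched with pandas as follows (the column names userId, movieId, rating and timestamp are assumptions about the ratings CSV, not taken from the library)::

    import pandas as pd

    def get_timelines_sketch(path_ratings, min_rating=4, min_length=5):
        """Illustrative re-implementation, not the library code."""
        ratings = pd.read_csv(path_ratings)
        # Keep movies with ratings above the threshold
        ratings = ratings[ratings["rating"] > min_rating]
        # One timeline per user, ordered by timestamp
        ratings = ratings.sort_values("timestamp")
        timelines = [
            (str(uid), group["movieId"].tolist())
            for uid, group in ratings.groupby("userId")
        ]
        # Keep users with enough movies
        return [(uid, movies) for uid, movies in timelines if len(movies) > min_length]
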

deepr.examples.movielens.jobs.build_records.records_generator(timelines, target_ratio, num_negatives, shuffle_timelines, mapping)[source]

Convert Timelines to list of Records with negative samples.
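
A sketch of the split-and-sample step, assuming mapping maps raw movie ids to vocabulary indices and negatives are drawn uniformly (both assumptions)::

    import random

    def records_generator_sketch(timelines, target_ratio, num_negatives, shuffle_timelines, mapping):
        """Illustrative generator, not the library code."""
        vocab_size = len(mapping)
        for uid, movies in timelines:
            ids = [mapping[movie] for movie in movies if movie in mapping]
            if shuffle_timelines:
                random.shuffle(ids)
            # Split the timeline into input / target parts
            split = int(len(ids) * (1 - target_ratio))
            input_positives, target_positives = ids[:split], ids[split:]
            # Sample num_negatives uniform negatives for each target item
            target_negatives = [
                [random.randrange(vocab_size) for _ in range(num_negatives)]
                for _ in target_positives
            ]
            yield {
                "uid": uid,
                "inputPositives": input_positives,
                "targetPositives": target_positives,
                "targetNegatives": target_negatives,
            }
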

deepr.examples.movielens.jobs.build_records.write_records(gen, path)[source]

Write records to path.
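
Serializing such records with tf.io.TFRecordWriter might look like this (the int64 encoding of every field, including the uid, is an assumption)::

    import tensorflow as tf

    def write_records_sketch(gen, path):
        """Illustrative writer, not the library code."""
        def _int64_feature(values):
            return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

        with tf.io.TFRecordWriter(path) as writer:
            for record in gen:
                features = tf.train.Features(feature={
                    "uid": _int64_feature([int(record["uid"])]),
                    "inputPositives": _int64_feature(record["inputPositives"]),
                    "targetPositives": _int64_feature(record["targetPositives"]),
                    # Flatten the [size_target, num_negatives] matrix into one list
                    "targetNegatives": _int64_feature(
                        [neg for negatives in record["targetNegatives"] for neg in negatives]
                    ),
                })
                writer.write(tf.train.Example(features=features).SerializeToString())
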

deepr.examples.movielens.jobs.evaluate module

Evaluate MovieLens.

class deepr.examples.movielens.jobs.evaluate.Evaluate(path_predictions, path_embeddings, path_biases=None, k=50, use_mlflow=False, num_queries=1000)[source]

Bases: Job

Evaluate MovieLens using a Faiss Index.

For each user embedding, the num_queries nearest items are retrieved from the index. The input items are removed from the results, and the remaining top-k results are compared to the target items. (A sketch of this step follows the attribute list below.)

k: Union[int, List[int]] = 50
num_queries: int = 1000
path_biases: Optional[str] = None
path_embeddings: str
path_predictions: str
run()[source]

Run Job

use_mlflow: bool = False
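
A sketch of the retrieval-and-filter step with faiss, assuming embeddings are compared by inner product (index type and variable names are illustrative)::

    import faiss
    import numpy as np

    def retrieve_top_k_sketch(user_embeddings, item_embeddings, input_ids, k=50, num_queries=1000):
        """Illustrative retrieval, not the library code."""
        index = faiss.IndexFlatIP(item_embeddings.shape[1])    # inner-product index
        index.add(item_embeddings.astype(np.float32))
        # Retrieve num_queries candidates per user embedding
        _, retrieved = index.search(user_embeddings.astype(np.float32), num_queries)
        predictions = []
        for candidates, inputs in zip(retrieved, input_ids):
            # Remove the user's input items, keep the top-k of what remains
            seen = set(inputs)
            predictions.append([item for item in candidates if item not in seen][:k])
        return predictions
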
deepr.examples.movielens.jobs.evaluate.compute_metrics(inputs, targets, predictions, k)[source]

Compute Recall, Precision and F1.

deepr.examples.movielens.jobs.evaluate.ndcg_score(true, pred, k)[source]

Compute Normalized Discounted Cumulative Gain.
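
For reference, NDCG@k with binary relevance can be computed as follows (an illustrative formula, not necessarily the exact implementation)::

    import math

    def ndcg_at_k_sketch(true, pred, k):
        """NDCG@k with binary relevance, illustrative only."""
        relevant = set(true)
        # Discounted cumulative gain of the predicted ranking
        dcg = sum(
            1.0 / math.log2(rank + 2)
            for rank, item in enumerate(pred[:k])
            if item in relevant
        )
        # Ideal DCG: all relevant items ranked first
        idcg = sum(1.0 / math.log2(rank + 2) for rank in range(min(len(relevant), k)))
        return dcg / idcg if idcg > 0 else 0.0
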

deepr.examples.movielens.jobs.evaluate.precision_recall_f1(true, pred, k)[source]

Compute precision, recall and f1_score.
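
The corresponding precision, recall and F1 at k, again as an illustrative reference::

    def precision_recall_f1_sketch(true, pred, k):
        """Precision@k, recall@k and F1@k with binary relevance, illustrative only."""
        relevant = set(true)
        hits = len(relevant & set(pred[:k]))
        precision = hits / k
        recall = hits / len(relevant) if relevant else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1
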

deepr.examples.movielens.jobs.init module

Init Checkpoint with SVD embeddings.

class deepr.examples.movielens.jobs.init.InitCheckpoint(path_embeddings, path_init_ckpt, normalize, path_counts=None, use_log_counts=True, normalize_counts=True)[source]

Bases: Job

Init Checkpoint with SVD embeddings.

normalize: bool
normalize_counts: bool = True
path_counts: Optional[str] = None
path_embeddings: str
path_init_ckpt: str
run()[source]

Run Job

use_log_counts: bool = True
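
A sketch of what the initialization could do, assuming the embeddings are stored as a NumPy array and are written to a single checkpoint variable named "embeddings" (both assumptions)::

    import numpy as np
    import tensorflow as tf

    def init_checkpoint_sketch(path_embeddings, path_init_ckpt, normalize=True,
                               counts=None, use_log_counts=True, normalize_counts=True):
        """Illustrative initialization, not the library code."""
        embeddings = np.load(path_embeddings).astype(np.float64)
        if counts is not None:
            # Optionally reweight each embedding by its (log) item count
            weights = np.log(counts) if use_log_counts else np.asarray(counts, dtype=np.float64)
            if normalize_counts:
                weights = weights / weights.max()
            embeddings = embeddings * weights[:, None]
        if normalize:
            embeddings = embeddings / np.linalg.norm(embeddings, axis=-1, keepdims=True)
        with tf.Graph().as_default():
            tf.compat.v1.get_variable("embeddings", initializer=embeddings.astype(np.float32))
            saver = tf.compat.v1.train.Saver()
            with tf.compat.v1.Session() as sess:
                sess.run(tf.compat.v1.global_variables_initializer())
                saver.save(sess, path_init_ckpt)
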

deepr.examples.movielens.jobs.predict module

Compute MovieLens predictions.

class deepr.examples.movielens.jobs.predict.Predict(path_saved_model, path_predictions, input_fn, prepro_fn)[source]

Bases: Job

Compute MovieLens predictions.

input_fn: Callable[[], DatasetV1]
path_predictions: str
path_saved_model: str
prepro_fn: Callable[[DatasetV1, str], DatasetV1]
run()[source]

Run Job
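
A sketch of the prediction loop, assuming a TF2-style SavedModel with the default "serving_default" signature and a dataset of feature dictionaries (all assumptions)::

    import tensorflow as tf

    def predict_sketch(path_saved_model, input_fn, prepro_fn):
        """Illustrative prediction loop, not the library code."""
        saved_model = tf.saved_model.load(path_saved_model)
        predict_fn = saved_model.signatures["serving_default"]   # assumed signature name

        dataset = prepro_fn(input_fn(), "predict")                # assumed mode string
        predictions = []
        for batch in dataset:
            outputs = predict_fn(**batch)                         # batch: dict of tensors
            predictions.append({key: value.numpy() for key, value in outputs.items()})
        return predictions
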

deepr.examples.movielens.jobs.svd module

Build SVD embeddings from the MovieLens dataset.

class deepr.examples.movielens.jobs.svd.SVD(path_csv, path_embeddings, path_counts, vocab_size, dim=600, min_count=10)[source]

Bases: Job

Build SVD embeddings.

dim: int = 600
min_count: int = 10
path_counts: str
path_csv: str
path_embeddings: str
run()[source]

Run Job

vocab_size: int
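
The overall pipeline could be sketched as a truncated SVD of the PMI matrix built from item co-occurrences, for example with scipy (the sqrt-of-singular-values scaling is a common choice, not necessarily the library's)::

    import numpy as np
    from scipy.sparse.linalg import svds

    def svd_embeddings_sketch(pmi_matrix, dim=600):
        """Illustrative factorization, not the library code."""
        # Truncated SVD of the (sparse) PMI matrix; dim must be < min(pmi_matrix.shape)
        u, s, _ = svds(pmi_matrix, k=dim)
        # Scale left singular vectors by the square root of the singular values
        return (u * np.sqrt(s)).astype(np.float32)
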
deepr.examples.movielens.jobs.svd.compute_coocurrence(user_item, min_count)[source]

Compute co-occurrence matrix from user-item matrix.
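
With a sparse user-item matrix (users as rows, items as columns), the item-item co-occurrence is essentially the product of the transposed matrix with itself; a sketch with scipy.sparse (dropping the diagonal is an assumption)::

    import scipy.sparse as sp

    def compute_cooccurrence_sketch(user_item, min_count=10):
        """Illustrative co-occurrence, not the library code."""
        user_item = sp.csr_matrix(user_item)
        # Item-item co-occurrence counts
        cooccurrence = (user_item.T @ user_item).tocsr()
        cooccurrence.setdiag(0)                                # drop self co-occurrences
        # Drop entries below the minimum count
        cooccurrence.data[cooccurrence.data < min_count] = 0
        cooccurrence.eliminate_zeros()
        return cooccurrence
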

deepr.examples.movielens.jobs.svd.compute_pmi(matrix, cds=0.75, additive_smoothing=0.0, pmi_power=1.0, k=1.0)[source]

Compute PMI matrix from item-item matrix.
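
A dense sketch of shifted PMI with context-distribution smoothing (cds) and shift log(k); the exact roles of additive_smoothing and pmi_power here are assumptions::

    import numpy as np

    def compute_pmi_sketch(matrix, cds=0.75, additive_smoothing=0.0, pmi_power=1.0, k=1.0):
        """Illustrative PMI, not the library code (dense, for small matrices)."""
        counts = np.asarray(matrix, dtype=np.float64) + additive_smoothing
        total = counts.sum()
        row = counts.sum(axis=1, keepdims=True)
        # Context-distribution smoothing: raise context counts to the power cds
        col = counts.sum(axis=0, keepdims=True) ** cds
        col = col / col.sum() * total                      # renormalize to the same total mass
        with np.errstate(divide="ignore"):
            pmi = np.log((counts ** pmi_power) * total / (row * col))
        # Shifted positive PMI: subtract log(k) and clip at zero
        return np.maximum(pmi - np.log(k), 0.0)
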

Module contents