deepr.examples.movielens.jobs package

Submodules

deepr.examples.movielens.jobs.build_records module

Build MovieLens dataset as TFRecords.

class deepr.examples.movielens.jobs.build_records.BuildRecords(path_ratings, path_mapping, path_train, path_eval, path_test, min_rating=4, min_length=5, num_negatives=8, target_ratio=0.2, size_test=10000, size_eval=10000, shuffle_timelines=True, seed=2020)[source]

Bases: Job

Build MovieLens dataset as TFRecords.

It aggregates movie ratings by user and builds a timeline of movies for each user. The users are split into train / validation / test sets. Each timeline is split into two sub-timelines: one input, one target. For each item in the target, num_negatives negatives are sampled.

The resulting TFRecords have the following fields:

- “uid”: ()
- “inputPositives”: [size_input]
- “targetPositives”: [size_target]
- “targetNegatives”: [size_target, num_negatives]
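The record layout above can be illustrated with a minimal sketch. The helper names (split_timeline, sample_negatives, make_record), the vocab_size argument, and the choice to take the last items of the timeline as the target are assumptions made for illustration; they are not taken from the deepr source, and the mapping and shuffling steps are left out.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_timeline(timeline: Sequence[int], target_ratio: float) -> Tuple[List[int], List[int]]:
    """Split a timeline into (input, target); the target is the tail (assumption)."""
    size_target = max(1, int(len(timeline) * target_ratio))
    return list(timeline[:-size_target]), list(timeline[-size_target:])


def sample_negatives(
    timeline: Sequence[int],
    vocab_size: int,
    num_negatives: int,
    size_target: int,
    rng: random.Random,
) -> List[List[int]]:
    """Sample num_negatives unseen items for each target item."""
    seen = set(timeline)
    candidates = [item for item in range(vocab_size) if item not in seen]
    return [rng.sample(candidates, num_negatives) for _ in range(size_target)]


def make_record(
    uid: str,
    timeline: Sequence[int],
    target_ratio: float,
    num_negatives: int,
    vocab_size: int,
    rng: random.Random,
) -> Dict[str, object]:
    """Build one record with the four documented fields."""
    inputs, targets = split_timeline(timeline, target_ratio)
    negatives = sample_negatives(timeline, vocab_size, num_negatives, len(targets), rng)
    return {
        "uid": uid,
        "inputPositives": inputs,      # shape [size_input]
        "targetPositives": targets,    # shape [size_target]
        "targetNegatives": negatives,  # shape [size_target, num_negatives]
    }


record = make_record("u1", list(range(10)), 0.2, 8, 100, random.Random(2020))
```

With the default target_ratio of 0.2, a timeline of 10 movies yields 8 input items and 2 target items, each target paired with 8 negatives.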

min_length: int = 5

min_rating: int = 4

num_negatives: int = 8

path_eval: str

path_mapping: str

path_ratings: str

path_test: str

path_train: str

run()[source]: Run Job

seed: int = 2020

shuffle_timelines: bool = True

size_eval: int = 10000

size_test: int = 10000

target_ratio: float = 0.2

deepr.examples.movielens.jobs.build_records.get_timelines(path_ratings, min_rating, min_length)[source]

Build timelines from MovieLens Dataset.

Apply the following filters:

- keep movies with ratings > min_rating
- keep users with number of movies > min_length

Return type: List[Tuple[str, List[int]]]
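The two filters can be sketched on in-memory data as follows. The function name build_timelines, the (uid, movie, rating, timestamp) row layout, and the ordering of each timeline by timestamp are assumptions; the documented function reads its input from path_ratings instead, and the file format is not described here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_timelines(
    ratings: Iterable[Tuple[str, int, float, int]],
    min_rating: float,
    min_length: int,
) -> List[Tuple[str, List[int]]]:
    """Group rated movies by user, applying the two documented filters."""
    per_user: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for uid, movie, rating, timestamp in ratings:
        # Filter 1: keep movies with rating > min_rating (strict, as documented).
        if rating > min_rating:
            per_user[uid].append((timestamp, movie))

    timelines = []
    for uid, events in per_user.items():
        # Filter 2: keep users with number of movies > min_length.
        if len(events) > min_length:
            # Ordering by timestamp is an assumption of this sketch.
            timelines.append((uid, [movie for _, movie in sorted(events)]))
    return timelines


sample = [("u1", 10, 5, 3), ("u1", 11, 5, 1), ("u1", 12, 4, 2), ("u2", 10, 5, 1)]
timelines = build_timelines(sample, min_rating=4, min_length=1)
```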

deepr.examples.movielens.jobs.build_records.records_generator(timelines, target_ratio, num_negatives, shuffle_timelines, mapping)[source]: Convert Timelines to list of Records with negative samples.

deepr.examples.movielens.jobs.build_records.write_records(gen, path)[source]: Write records to path.

deepr.examples.movielens.jobs.evaluate module

Evaluate MovieLens.

class deepr.examples.movielens.jobs.evaluate.Evaluate(path_predictions, path_embeddings, path_biases=None, k=50, use_mlflow=False, num_queries=1000)[source]

Bases: Job

Evaluate MovieLens using a Faiss Index.

For each user embedding, the top num_queries items are retrieved. The input items are removed from the results, then we compare the remaining top-K results to the target items.
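The retrieval and filtering steps can be sketched without Faiss. In this illustration a brute-force dot-product search stands in for the Faiss index, and the function name top_k_excluding_inputs is hypothetical.

```python
from typing import List, Sequence


def top_k_excluding_inputs(
    user: Sequence[float],
    items: Sequence[Sequence[float]],
    inputs: Sequence[int],
    k: int,
    num_queries: int,
) -> List[int]:
    """Retrieve num_queries items by dot product, drop input items, keep top k."""
    scores = [
        (sum(u * v for u, v in zip(user, item)), index)
        for index, item in enumerate(items)
    ]
    # Highest score first; ties broken by item index for determinism.
    ranked = [index for _, index in sorted(scores, key=lambda pair: (-pair[0], pair[1]))]
    retrieved = ranked[:num_queries]
    seen = set(inputs)
    return [index for index in retrieved if index not in seen][:k]


item_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
predictions = top_k_excluding_inputs([1.0, 0.0], item_embeddings, inputs=[0], k=2, num_queries=3)
```

Retrieving num_queries items before filtering matters because removing the input items shrinks the result list; num_queries should be large enough that at least k items remain.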

k: Union[int, List[int]] = 50

num_queries: int = 1000

path_biases: Optional[str] = None

path_embeddings: str

path_predictions: str

run()[source]: Run Job

use_mlflow: bool = False

deepr.examples.movielens.jobs.evaluate.compute_metrics(inputs, targets, predictions, k)[source]: Compute Recall, Precision and F1.

deepr.examples.movielens.jobs.evaluate.ndcg_score(true, pred, k)[source]: Compute Normalized Discounted Cumulative Gain.
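As an illustration of the metric, here is a minimal sketch of NDCG at k with binary relevance. The function name ndcg_at_k and the binary-relevance assumption are choices made for this sketch and may differ from what ndcg_score does.

```python
import math
from typing import Sequence


def ndcg_at_k(true: Sequence[int], pred: Sequence[int], k: int) -> float:
    """NDCG@k with binary relevance: an item is relevant if it appears in true."""
    relevant = set(true)
    # Discounted gain of each hit, with the rank starting at 1.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(pred[:k])
        if item in relevant
    )
    # Ideal ranking puts all relevant items first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


perfect = ndcg_at_k([1, 2], [1, 2, 3], k=3)
shifted = ndcg_at_k([1], [3, 1], k=2)
```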

deepr.examples.movielens.jobs.evaluate.precision_recall_f1(true, pred, k)[source]: Compute precision, recall and f1_score.
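A minimal sketch of these three metrics at cutoff k is shown below. The function name precision_recall_f1_at_k and the convention of dividing precision by k are assumptions; the library function may use a different normalization.

```python
from typing import Sequence, Tuple


def precision_recall_f1_at_k(
    true: Sequence[int], pred: Sequence[int], k: int
) -> Tuple[float, float, float]:
    """Precision, recall and F1 of the top-k predictions against the targets."""
    hits = len(set(true) & set(pred[:k]))
    precision = hits / k if k > 0 else 0.0        # assumption: normalize by k
    recall = hits / len(true) if true else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = precision_recall_f1_at_k([1, 2, 3], [1, 4, 2, 5], k=2)
```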

deepr.examples.movielens.jobs.init module

Init Checkpoint with SVD embeddings.

class deepr.examples.movielens.jobs.init.InitCheckpoint(path_embeddings, path_init_ckpt, normalize, path_counts=None, use_log_counts=True, normalize_counts=True)[source]

Bases: Job

Init Checkpoint with SVD embeddings.

normalize: bool

normalize_counts: bool = True

path_counts: Optional[str] = None

path_embeddings: str

path_init_ckpt: str

run()[source]: Run Job

use_log_counts: bool = True

deepr.examples.movielens.jobs.predict module

Compute MovieLens predictions.

class deepr.examples.movielens.jobs.predict.Predict(path_saved_model, path_predictions, input_fn, prepro_fn)[source]

Bases: Job

Compute MovieLens predictions.

input_fn: Callable[[], tensorflow.python.data.ops.dataset_ops.DatasetV1]

path_predictions: str

path_saved_model: str

prepro_fn: Callable[[tensorflow.python.data.ops.dataset_ops.DatasetV1, str], tensorflow.python.data.ops.dataset_ops.DatasetV1]

run()[source]: Run Job

deepr.examples.movielens.jobs.svd module

Build SVD embeddings.

class deepr.examples.movielens.jobs.svd.SVD(path_csv, path_embeddings, path_counts, vocab_size, dim=600, min_count=10)[source]

Bases: Job

Build SVD embeddings.

dim: int = 600

min_count: int = 10

path_counts: str

path_csv: str

path_embeddings: str

run()[source]: Run Job

vocab_size: int

deepr.examples.movielens.jobs.svd.compute_coocurrence(user_item, min_count)[source]: Compute co-occurrence matrix from user-item matrix.
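To make the idea concrete, here is a minimal sketch that counts item co-occurrences from a dense 0/1 user-item matrix. The function name cooccurrence, the zeroed diagonal, and the interpretation of min_count as a threshold below which counts are dropped are all assumptions of this sketch, not documented behavior.

```python
from typing import List, Sequence


def cooccurrence(user_item: Sequence[Sequence[int]], min_count: int) -> List[List[int]]:
    """Count, for each item pair, the users who interacted with both items."""
    num_items = len(user_item[0]) if user_item else 0
    counts = [[0] * num_items for _ in range(num_items)]
    for row in user_item:
        items = [index for index, value in enumerate(row) if value]
        for a in items:
            for b in items:
                if a != b:  # assumption: self co-occurrence is ignored
                    counts[a][b] += 1
    # Assumption: pairs seen fewer than min_count times are set to zero.
    return [[c if c >= min_count else 0 for c in row] for row in counts]


matrix = cooccurrence([[1, 1, 0], [1, 1, 1], [0, 1, 1]], min_count=2)
```

Off the diagonal, this is the product of the transposed user-item matrix with itself, written out as loops.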

deepr.examples.movielens.jobs.svd.compute_pmi(matrix, cds=0.75, additive_smoothing=0.0, pmi_power=1.0, k=1.0)[source]: Compute PMI matrix from item-item matrix.
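As a rough illustration, pointwise mutual information compares the observed co-occurrence probability of a pair with the product of its marginal probabilities. The sketch below models only context-distribution smoothing (the column counts raised to the power cds); additive_smoothing, pmi_power and k are not modeled, and the function name pmi_matrix is hypothetical.

```python
import math
from typing import List, Sequence


def pmi_matrix(counts: Sequence[Sequence[float]], cds: float = 0.75) -> List[List[float]]:
    """PMI(i, j) = log(p(i, j) / (p(i) * p_cds(j))) for pairs with nonzero count."""
    n = len(counts)
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    # Context distribution smoothing: raise column counts to the power cds.
    context = [sum(counts[i][j] for i in range(n)) ** cds for j in range(n)]
    context_total = sum(context)

    pmi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if counts[i][j] > 0:
                p_ij = counts[i][j] / total
                p_i = row_sums[i] / total
                p_j = context[j] / context_total
                pmi[i][j] = math.log(p_ij / (p_i * p_j))
    return pmi


scores = pmi_matrix([[0, 2], [2, 0]], cds=1.0)
```

Pairs that never co-occur are left at 0.0 here rather than negative infinity, which keeps the matrix finite.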
