deepr.prepros.Filter

class deepr.prepros.Filter(predicate, on_dict=True, modes=None)

Filter a dataset, keeping only the elements for which predicate is True.

A Filter instance applies a predicate to every element of a dataset and keeps only the elements for which the predicate returns True.

By default, elements are expected to be dictionaries. You can set on_dict=False if your dataset does not yield dictionaries.

Because some preprocessing pipelines behave differently depending on the mode (TRAIN, EVAL, PREDICT), an optional modes argument can be provided. By setting modes, you select the modes in which the filter should apply; in any other mode the dataset is returned unchanged. For example:

>>> import tensorflow as tf
>>> from deepr import readers
>>> from deepr.prepros import Filter
>>> def gen():
...     yield {"a": 0}
...     yield {"a": 1}
>>> raw_dataset = tf.data.Dataset.from_generator(gen, {"a": tf.int32}, {"a": tf.TensorShape([])})
>>> list(readers.from_dataset(raw_dataset))
[{'a': 0}, {'a': 1}]
>>> def predicate(x):
...     return {"b": tf.equal(x["a"], 0)}
>>> prepro_fn = Filter(predicate, modes=[tf.estimator.ModeKeys.TRAIN])
>>> raw_dataset = tf.data.Dataset.from_generator(gen, {"a": tf.int32}, {"a": tf.TensorShape([])})
>>> dataset = prepro_fn(raw_dataset, tf.estimator.ModeKeys.TRAIN)
>>> list(readers.from_dataset(dataset))
[{'a': 0}]
>>> dataset = prepro_fn(raw_dataset, tf.estimator.ModeKeys.PREDICT)
>>> list(readers.from_dataset(dataset))
[{'a': 0}, {'a': 1}]

If the mode is not given at runtime, the preprocessing is applied.

>>> dataset = prepro_fn(raw_dataset)
>>> list(readers.from_dataset(dataset))
[{'a': 0}]

predicate

Predicate function. It returns either a tf.bool tensor or a dictionary with a single key.

Type:

Callable
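
The two accepted return shapes of the predicate can be illustrated with a minimal plain-Python sketch. The helper name unwrap_predicate_output is hypothetical and not part of deepr, Python booleans stand in for tf.bool tensors, and raising an error for dictionaries with more than one key is an assumption about reasonable behavior rather than something stated above.

```python
def unwrap_predicate_output(output):
    """Reduce a predicate result to a single boolean value.

    Hypothetical helper (not part of deepr): accepts either a bare
    boolean or a dictionary holding exactly one boolean value.
    """
    if isinstance(output, dict):
        if len(output) != 1:
            # Assumption: more than one key is ambiguous, so reject it.
            raise ValueError(f"Expected exactly one key, got {list(output)}")
        return next(iter(output.values()))
    return output


def predicate_bare(x):
    # Returns a bare boolean.
    return x["a"] == 0


def predicate_dict(x):
    # Returns a one-key dictionary, as in the example above.
    return {"b": x["a"] == 0}


elements = [{"a": 0}, {"a": 1}]
kept_bare = [x for x in elements if unwrap_predicate_output(predicate_bare(x))]
kept_dict = [x for x in elements if unwrap_predicate_output(predicate_dict(x))]
```

Under these assumptions both predicate styles select the same elements, so the choice between them is a matter of convenience.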

on_dict

If True (default), assumes the dataset yields dictionaries.

Type:

bool, Optional

modes

Active modes for the filter (the filter is skipped in any mode not listed in modes). Default is None (all modes are considered active).

Type:

Iterable[str], Optional
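
The mode handling described for modes can be sketched in plain Python. This is an illustration of the documented behavior only, not deepr's implementation: apply_filter is a hypothetical function, lists stand in for tf.data datasets, and the string constants stand in for tf.estimator.ModeKeys.

```python
# Plain strings standing in for tf.estimator.ModeKeys (assumption for the sketch).
TRAIN, EVAL, PREDICT = "train", "eval", "infer"


def apply_filter(elements, predicate, modes=None, mode=None):
    """Filter elements unless the given mode is excluded by modes.

    Hypothetical stand-in for the documented behavior:
    - modes is None: the filter is active in every mode.
    - mode is None: the filter is applied regardless of modes.
    - otherwise: the filter is applied only if mode is in modes.
    """
    if mode is not None and modes is not None and mode not in modes:
        return list(elements)  # inactive mode: pass elements through unchanged
    return [x for x in elements if predicate(x)]


def is_zero(x):
    return x["a"] == 0


data = [{"a": 0}, {"a": 1}]
train_out = apply_filter(data, is_zero, modes=[TRAIN], mode=TRAIN)
predict_out = apply_filter(data, is_zero, modes=[TRAIN], mode=PREDICT)
no_mode_out = apply_filter(data, is_zero, modes=[TRAIN])
```

The three calls mirror the TRAIN, PREDICT, and no-mode cases from the example at the top of this page.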

__init__(predicate, on_dict=True, modes=None)

Methods

__init__(predicate[, on_dict, modes])

apply(dataset[, mode])

Pre-process a dataset

Attributes

tf_predicate

Return final predicate function.