[ ]:
!pip install deepr[cpu]

Quickstart

This notebook is a gentle introduction to the few concepts and abstractions of deepr.

It demonstrates how to train a model that learns how to multiply a number by 2.

To train a model with deepr, the main entry point is the Trainer job.

It is important at this point to stress that deepr is not yet another library for building neural networks, but merely a utility for building functions that operate on basic TensorFlow types, i.e. tf.Tensor and tf.data.Dataset.

Using functional programming makes it easy to lazily define graphs that will only be built at run time by the tf.estimator high-level API.

The Trainer job uses most of the important concepts of deepr, while only expecting basic types (mainly functions operating on datasets, dictionaries of tensors, etc.). Its main arguments are:

  • path_model : str. Path to the model directory. Can be either local or HDFS.

  • pred_fn : Callable[[Dict[str, tf.Tensor], str], Dict[str, tf.Tensor]]. Typically a Layer instance, but in general, any callable.

  • loss_fn : Callable[[Dict[str, tf.Tensor], str], Dict[str, tf.Tensor]]. Typically a Layer instance, but in general, any callable.

  • optimizer_fn : Callable[[tf.Tensor], tf.Tensor]. Typically an Optimizer instance, but in general, any callable.

  • train_input_fn : Callable[[], tf.data.Dataset]. Typically a Reader instance, but in general, any callable.

  • eval_input_fn : Callable[[], tf.data.Dataset]. Typically a Reader instance, but in general, any callable.

  • prepro_fn : Callable[[tf.data.Dataset, str], tf.data.Dataset], optional. Typically a Prepro instance, but in general, any callable.

There are more parameters that use the other concepts (hooks, metrics, exporters, …); these will be covered in another guide.
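Because any callable with the right signature is accepted, plain Python functions would work just as well as Layer instances. Here is a purely hypothetical sketch (the names my_pred_fn and my_loss_fn are illustrative only, and tensorflow is assumed to be imported as tf, as done later in this notebook):

def my_pred_fn(tensors, mode):
    # Operates on the dataset dictionary of tensors and returns new tensors (the predictions)
    return {"y_pred": 2.0 * tensors["x"]}

def my_loss_fn(tensors, mode):
    # Operates on the dataset tensors and the pred_fn outputs and returns a "loss" tensor
    return {"loss": tf.reduce_sum((tensors["y_pred"] - tensors["y"]) ** 2)}

The rest of this guide defines the same kind of functions with the Layer helpers instead.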

So to train our model, we need to define all of the above. Let’s start!

Dataset

The first step is to build a dataset. For this, we will build a synthetic dataset of (x, 2x) pairs.

Also see other ways to build a dataset in the reader reference.

Some imports first

[1]:
import logging
import sys
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logging.getLogger("tensorflow").setLevel(logging.CRITICAL)
[2]:
import tensorflow as tf
import deepr
import numpy as np
[3]:
# Remove any previously trained model so that we start from scratch
if deepr.io.Path("model").is_dir():
    deepr.io.Path("model").delete_dir()

Let’s define a generator function and then use a GeneratorReader to create a tf.data.Dataset.

[4]:
def generator_fn():
    for _ in range(1000):
        x = np.random.random()
        yield {"x": x, "y": 2 * x}

reader = deepr.readers.GeneratorReader(
    generator_fn,
    output_types={"x": tf.float32, "y": tf.float32},
    output_shapes={"x": (), "y": ()},
)

The Reader classes are simple helpers that create a tf.data.Dataset, heavily inspired by the tensorflow_datasets package.
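Conceptually, the reader above is close to calling the standard tf.data factory directly (a simplified sketch for intuition, not the actual deepr implementation):

# Roughly what the GeneratorReader configuration corresponds to (sketch only)
dataset = tf.data.Dataset.from_generator(
    generator_fn,
    output_types={"x": tf.float32, "y": tf.float32},
    output_shapes={"x": (), "y": ()},
)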

Once the reader is configured, you can create a new Dataset with

[5]:
dataset = reader.as_dataset()
print(dataset)
dataset = reader()  # Simply an alias for as_dataset
print(dataset)
<DatasetV1Adapter shapes: {x: (), y: ()}, types: {x: tf.float32, y: tf.float32}>
<DatasetV1Adapter shapes: {x: (), y: ()}, types: {x: tf.float32, y: tf.float32}>

Iterating over a tf.data.Dataset in “graph” mode is not possible.

The base Reader class makes it possible to iterate over the dataset, faking eager-execution mode (under the hood it simply creates a session in the special __iter__ method).
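Under the hood, this is roughly equivalent to the following loop (a simplified sketch, not the actual deepr code):

# Simplified sketch of eager-like iteration over a tf.data.Dataset in graph mode
def iterate(dataset):
    next_element = dataset.make_one_shot_iterator().get_next()
    with tf.Session() as sess:
        while True:
            try:
                yield sess.run(next_element)
            except tf.errors.OutOfRangeError:
                break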

Let’s have a look at the content of our dataset:

[6]:
for index, item in enumerate(reader):
    print(item)
    if index == 10:
        break
{'x': 0.026452266, 'y': 0.05290453}
{'x': 0.22739638, 'y': 0.45479277}
{'x': 0.5724985, 'y': 1.144997}
{'x': 0.403317, 'y': 0.806634}
{'x': 0.21341616, 'y': 0.42683232}
{'x': 0.83121186, 'y': 1.6624237}
{'x': 0.3990266, 'y': 0.7980532}
{'x': 0.7587566, 'y': 1.5175132}
{'x': 0.24581175, 'y': 0.4916235}
{'x': 0.40846375, 'y': 0.8169275}
{'x': 0.27732524, 'y': 0.5546505}

The Trainer job expects two input_fn arguments: simple callables that create a new tf.data.Dataset.

Our reader does exactly that, so let’s simply set:

[7]:
train_input_fn = reader
eval_input_fn = reader

Prepro

Now that we have datasets, we need to preprocess them before feeding data to our model. In this example, we only need to create batches of data, and allow multiple iterations over the dataset to be able to perform multiple epochs.

Let’s use the prepro module to functionally define a preprocessing function.

See the prepro reference.

[8]:
prepro_fn = deepr.prepros.Serial(
    deepr.prepros.Batch(batch_size=32),
    deepr.prepros.Repeat(10, modes=[tf.estimator.ModeKeys.TRAIN])
)

As expected, the output of this prepro function is a batched dataset:

[9]:
prepro_fn(reader())
[9]:
<DatasetV1Adapter shapes: {x: (?,), y: (?,)}, types: {x: tf.float32, y: tf.float32}>

Let’s check the result of our preprocessing by iterating over the dataset. We use the helper function from_dataset that creates a reader from any tf.data.Dataset, which gives us eager-like iteration over the underlying dataset.

[10]:
for item in deepr.readers.base.from_dataset(prepro_fn(reader())):
    print(item)
    break
{'x': array([8.38533640e-01, 5.11415541e-01, 4.91451062e-02, 6.71378195e-01,
       7.19314665e-02, 2.07208991e-01, 6.07405782e-01, 2.14489564e-01,
       1.24138966e-01, 5.16671121e-01, 2.33591374e-04, 3.69159013e-01,
       3.05574089e-01, 9.81275201e-01, 4.54333931e-01, 3.23030204e-01,
       6.02127731e-01, 2.13016793e-01, 8.41403484e-01, 6.13585055e-01,
       1.33147994e-02, 6.54389381e-01, 8.09324920e-01, 5.17527759e-01,
       2.62713879e-01, 2.71976054e-01, 4.55039740e-01, 2.46606708e-01,
       8.55176270e-01, 2.10764825e-01, 9.98475403e-02, 1.92478955e-01],
      dtype=float32), 'y': array([1.6770673e+00, 1.0228311e+00, 9.8290212e-02, 1.3427564e+00,
       1.4386293e-01, 4.1441798e-01, 1.2148116e+00, 4.2897913e-01,
       2.4827793e-01, 1.0333422e+00, 4.6718275e-04, 7.3831803e-01,
       6.1114818e-01, 1.9625504e+00, 9.0866786e-01, 6.4606041e-01,
       1.2042555e+00, 4.2603359e-01, 1.6828070e+00, 1.2271701e+00,
       2.6629599e-02, 1.3087788e+00, 1.6186498e+00, 1.0350555e+00,
       5.2542776e-01, 5.4395211e-01, 9.1007948e-01, 4.9321342e-01,
       1.7103525e+00, 4.2152965e-01, 1.9969508e-01, 3.8495791e-01],
      dtype=float32)}

Model

Now that we have a preprocessed dataset, let’s build the model.

The dataset yields dictionaries of tensors.

The model is made of two main components:

  1. pred_fn(tensors: Dict, mode) -> Dict operates on the dataset dictionaries and creates new tensors (the predictions).

  2. loss_fn(tensors: Dict, mode) -> Dict operates on the dataset and the pred_fn results, and creates at least one new tensor, loss.

We’re going to use the layer module to quickly define those functions.

Make sure to check the layer reference for more information.

Pred function

The first part of the model is the prediction function.

Here it’s pretty simple: it predicts y_pred using a single alpha parameter, such that y_pred = alpha * x.

We first define this as a Multiply layer:

[11]:
@deepr.layers.layer(n_in=1, n_out=1)
def Multiply(tensors):
    alpha = tf.get_variable(name="alpha", shape=(), dtype=tf.float32)
    return alpha * tensors

The layer decorator creates a Layer class from the function, roughly equivalent to

class Multiply:

    def __init__(self, n_in=1, n_out=1, inputs=None, outputs=None, name=None):
        self.n_in = n_in
        self.n_out = n_out
        self.inputs = inputs
        self.outputs = outputs
        self.name = name

    def __call__(self, tensors, mode: str):
        if isinstance(tensors, dict):
            return self.forward_as_dict(tensors, mode)
        else:
            return self.forward(tensors, mode)

    def forward(self, tensors, mode: str):
        alpha = tf.get_variable(name="alpha", shape=(), dtype=tf.float32)
        return alpha * tensors

    def forward_as_dict(self, tensors: Dict, mode: str) -> Dict:
        return {self.outputs: self.forward(tensors[self.inputs], mode)}

We can instantiate our Layer with

[12]:
pred_fn = Multiply(inputs="x", outputs="y_pred")

The power of the base Layer class is that layers are actually functions that can operate on both dictionaries and tuples of tensors.

The inputs and outputs arguments, when given, specify the keys of the dictionaries to use for the layer.

Let’s see how it works:

[13]:
tf.reset_default_graph()
print(pred_fn(tf.constant(1.0)))
tf.reset_default_graph()  # Remove alpha variable from the graph
print(pred_fn({"x": tf.constant(1.0)}))
Tensor("mul:0", shape=(), dtype=float32)
{'y_pred': <tf.Tensor 'mul:0' shape=() dtype=float32>}

Let’s check the output of this model (alpha is initialized randomly):

[14]:
tf.reset_default_graph()
y_pred = pred_fn(tf.constant(1.0))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y_pred))
1.6913518

Loss function

Let’s now define the loss function. A squared L2 loss will work fine here; let’s create a layer for it:

[15]:
@deepr.layers.layer(n_in=2, n_out=1)
def SquaredL2(tensors):
    x, y = tensors
    return tf.reduce_sum((x-y)**2)
[16]:
loss_fn = SquaredL2(inputs=("y_pred", "y"), outputs="loss")

Let’s see if it works:

[17]:
with tf.Session() as sess:
    print(sess.run(loss_fn((tf.constant(1.0), tf.constant(0.5)))))
    print(sess.run(loss_fn({"y_pred": tf.constant(1.0), "y": tf.constant(0.5)})))
0.25
{'loss': 0.25}

Optimizer

The last thing we need is the optimizer. See the optimizer reference.

[18]:
optimizer_fn = deepr.optimizers.TensorflowOptimizer("Adam", 0.1)

Trainer job

Now that all these components are defined, let’s create a Trainer job.

Make sure to check the trainer reference.

[19]:
job = deepr.jobs.Trainer(
    path_model="model",
    pred_fn=pred_fn,
    loss_fn=loss_fn,
    optimizer_fn=optimizer_fn,
    train_input_fn=train_input_fn,
    eval_input_fn=eval_input_fn,
    prepro_fn=prepro_fn
)

Creating the job is lazy and doesn’t take any time. To run it, call the run method:

[20]:
job.run()
INFO:deepr.prepros.core:Not applying Repeat(10) (mode=eval)
INFO:deepr.jobs.trainer:Running final evaluation, using global_step = 320
INFO:deepr.prepros.core:Not applying Repeat(10) (mode=eval)
INFO:deepr.jobs.trainer:{'loss': 2.5611915e-12, 'global_step': 320}

The loss is practically 0. Great, we now know how to multiply by 2 :)

Let’s check that alpha is indeed equal to 2:

[21]:
experiment = job.create_experiment()
estimator = experiment.estimator
print(estimator.get_variable_value("alpha"))
2.0000005