[ ]:
!pip install deepr[cpu]
Quickstart
This notebook is a gentle introduction to the few concepts and abstractions of deepr.
It demonstrates how to train a model that learns how to multiply a number by 2.
To train a model with deepr, the main entry point is the Trainer job.
It is important at this point to stress that deepr is not yet another library to build neural networks, but merely a utility to build functions that operate on basic TensorFlow types, i.e. tf.Tensor and tf.data.Dataset.
Using functional programming makes it easy to lazily define graphs that will only be built at run time by the tf.estimator high-level API.
The Trainer job uses most of the important concepts of deepr, while only expecting basic types (mainly functions operating on datasets, dictionaries of tensors, etc.). Its main arguments are:
path_model : str
    Path to the model directory. Can be either local or HDFS.
pred_fn : Callable[[Dict[str, tf.Tensor], str], Dict[str, tf.Tensor]]
    Typically a Layer instance, but in general, any callable.
loss_fn : Callable[[Dict[str, tf.Tensor], str], Dict[str, tf.Tensor]]
    Typically a Layer instance, but in general, any callable.
optimizer_fn : Callable[[tf.Tensor], tf.Tensor]
    Typically an Optimizer instance, but in general, any callable.
train_input_fn : Callable[[], tf.data.Dataset]
    Typically a Reader instance, but in general, any callable.
eval_input_fn : Callable[[], tf.data.Dataset]
    Typically a Reader instance, but in general, any callable.
prepro_fn : Callable[[tf.data.Dataset, str], tf.data.Dataset], Optional
    Typically a Prepro instance, but in general, any callable.
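To make these signatures concrete, here is a minimal skeleton of the callables the Trainer expects (names and bodies below are purely illustrative; the rest of this guide builds working versions of each):

def pred_fn(tensors, mode):
    # Dict[str, tf.Tensor], str -> Dict[str, tf.Tensor] with the predictions
    ...

def loss_fn(tensors, mode):
    # Dict[str, tf.Tensor], str -> Dict[str, tf.Tensor] containing at least a "loss" tensor
    ...

def optimizer_fn(loss):
    # tf.Tensor -> tf.Tensor (the training op)
    ...

def train_input_fn():
    # () -> tf.data.Dataset
    ...

def prepro_fn(dataset, mode):
    # tf.data.Dataset, str -> tf.data.Dataset (optional)
    ...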
There are more parameters that use the other concepts (hooks, metrics, exporter, …); these will be covered in another guide.
So, to train our model, we need to define each of these. Let’s start!
Dataset
The first step is to build a dataset. For this we will build a synthetic dataset of pairs (x, 2x).
Also see other ways to build a dataset in the reader reference
Some imports first
[1]:
import logging
import sys
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logging.getLogger("tensorflow").setLevel(logging.CRITICAL)
[2]:
import tensorflow as tf
import deepr
import numpy as np
[3]:
if deepr.io.Path("model").is_dir():
    deepr.io.Path("model").delete_dir()
Let’s define a generator function and then use a GeneratorReader to create a tf.data.Dataset
[4]:
def generator_fn():
    for _ in range(1000):
        x = np.random.random()
        yield {"x": x, "y": 2 * x}

reader = deepr.readers.GeneratorReader(
    generator_fn,
    output_types={"x": tf.float32, "y": tf.float32},
    output_shapes={"x": (), "y": ()},
)
The Reader classes are simple helpers to create a tf.data.Dataset, heavily inspired by the tensorflow_datasets package.
Once the reader is configured, you can create a new Dataset with:
[5]:
dataset = reader.as_dataset()
print(dataset)
dataset = reader() # Simply an alias for as_dataset
print(dataset)
<DatasetV1Adapter shapes: {x: (), y: ()}, types: {x: tf.float32, y: tf.float32}>
<DatasetV1Adapter shapes: {x: (), y: ()}, types: {x: tf.float32, y: tf.float32}>
Iterating over a tf.data.Dataset in “graph” mode is not possible.
The base Reader class makes it possible to iterate over the dataset, faking eager-execution mode (under the hood it simply creates a session in the special __iter__ method).
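For intuition, here is a rough sketch of what such an __iter__ can do (illustrative only, not the actual deepr implementation): build a one-shot iterator and evaluate elements in a dedicated session.

def iterate_eagerly(dataset):
    # Illustrative sketch: run dataset elements through a session one by one
    next_element = dataset.make_one_shot_iterator().get_next()
    with tf.Session() as sess:
        while True:
            try:
                yield sess.run(next_element)
            except tf.errors.OutOfRangeError:
                break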
Let’s have a look at the content of our dataset
[6]:
for index, item in enumerate(reader):
    print(item)
    if index == 10:
        break
{'x': 0.026452266, 'y': 0.05290453}
{'x': 0.22739638, 'y': 0.45479277}
{'x': 0.5724985, 'y': 1.144997}
{'x': 0.403317, 'y': 0.806634}
{'x': 0.21341616, 'y': 0.42683232}
{'x': 0.83121186, 'y': 1.6624237}
{'x': 0.3990266, 'y': 0.7980532}
{'x': 0.7587566, 'y': 1.5175132}
{'x': 0.24581175, 'y': 0.4916235}
{'x': 0.40846375, 'y': 0.8169275}
{'x': 0.27732524, 'y': 0.5546505}
The Trainer job expects two input_fn arguments that are simple callables creating new tf.data.Dataset instances.
Our reader does exactly that, so let’s set
[7]:
train_input_fn = reader
eval_input_fn = reader
Prepro
Now that we have datasets, we need to preprocess them before feeding data to our model. In this example, we only need to create batches of data, and allow multiple iterations over the dataset to be able to perform multiple epochs.
Let’s use the prepro module to functionally define a preprocessing function.
See the prepro reference
[8]:
prepro_fn = deepr.prepros.Serial(
    deepr.prepros.Batch(batch_size=32),
    deepr.prepros.Repeat(10, modes=[tf.estimator.ModeKeys.TRAIN]),
)
As expected, the output of this prepro function is a batched dataset
[9]:
prepro_fn(reader())
[9]:
<DatasetV1Adapter shapes: {x: (?,), y: (?,)}, types: {x: tf.float32, y: tf.float32}>
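For intuition, the Serial prepro above behaves roughly like chaining raw tf.data calls by hand (a simplified sketch; mode handling in deepr is more general):

def manual_prepro(dataset, mode=tf.estimator.ModeKeys.TRAIN):
    # Sketch: batch in every mode, repeat only during training
    dataset = dataset.batch(32)
    if mode == tf.estimator.ModeKeys.TRAIN:
        dataset = dataset.repeat(10)
    return dataset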
Let’s check the result of our preprocessing by iterating over the dataset. We use the helper function from_dataset that creates a reader from any tf.data.Dataset, which gives us eager-like iteration over the underlying dataset.
[10]:
for item in deepr.readers.base.from_dataset(prepro_fn(reader())):
    print(item)
    break
{'x': array([8.38533640e-01, 5.11415541e-01, 4.91451062e-02, 6.71378195e-01,
7.19314665e-02, 2.07208991e-01, 6.07405782e-01, 2.14489564e-01,
1.24138966e-01, 5.16671121e-01, 2.33591374e-04, 3.69159013e-01,
3.05574089e-01, 9.81275201e-01, 4.54333931e-01, 3.23030204e-01,
6.02127731e-01, 2.13016793e-01, 8.41403484e-01, 6.13585055e-01,
1.33147994e-02, 6.54389381e-01, 8.09324920e-01, 5.17527759e-01,
2.62713879e-01, 2.71976054e-01, 4.55039740e-01, 2.46606708e-01,
8.55176270e-01, 2.10764825e-01, 9.98475403e-02, 1.92478955e-01],
dtype=float32), 'y': array([1.6770673e+00, 1.0228311e+00, 9.8290212e-02, 1.3427564e+00,
1.4386293e-01, 4.1441798e-01, 1.2148116e+00, 4.2897913e-01,
2.4827793e-01, 1.0333422e+00, 4.6718275e-04, 7.3831803e-01,
6.1114818e-01, 1.9625504e+00, 9.0866786e-01, 6.4606041e-01,
1.2042555e+00, 4.2603359e-01, 1.6828070e+00, 1.2271701e+00,
2.6629599e-02, 1.3087788e+00, 1.6186498e+00, 1.0350555e+00,
5.2542776e-01, 5.4395211e-01, 9.1007948e-01, 4.9321342e-01,
1.7103525e+00, 4.2152965e-01, 1.9969508e-01, 3.8495791e-01],
dtype=float32)}
Model
Now that we have a preprocessed dataset, let’s build the model.
The dataset yields dictionaries of tensors.
The model is made of 2 main components:

pred_fn(tensors: Dict, mode) -> Dict
    Operates on the dataset dictionaries and creates new tensors (the predictions).
loss_fn(tensors: Dict, mode) -> Dict
    Operates on the dataset and pred_fn results and creates at least one new tensor, loss.
We’re going to use the layer module to quickly define those functions.
Make sure to check the layer reference for more information.
Pred function
The first part of the model is the prediction function.
Here it’s pretty simple: it predicts y_pred using an alpha parameter such that y_pred = alpha * x.
We first define this as a Multiply layer:
[11]:
@deepr.layers.layer(n_in=1, n_out=1)
def Multiply(tensors):
    alpha = tf.get_variable(name="alpha", shape=(), dtype=tf.float32)
    return alpha * tensors
The layer decorator creates a Layer class from the function, roughly equivalent to:
class Multiply:

    def __init__(self, n_in=1, n_out=1, inputs=None, outputs=None, name=None):
        self.n_in = n_in
        self.n_out = n_out
        self.inputs = inputs
        self.outputs = outputs
        self.name = name

    def __call__(self, tensors, mode: str):
        if isinstance(tensors, dict):
            return self.forward_as_dict(tensors, mode)
        else:
            return self.forward(tensors, mode)

    def forward(self, tensors, mode: str):
        alpha = tf.get_variable(name="alpha", shape=(), dtype=tf.float32)
        return alpha * tensors

    def forward_as_dict(self, tensors: Dict, mode: str) -> Dict:
        return {self.outputs: self.forward(tensors[self.inputs], mode)}
We can instantiate our Layer with:
[12]:
pred_fn = Multiply(inputs="x", outputs="y_pred")
The power of the base Layer class is that layers are actually functions that can operate on both dictionaries and tuples of tensors.
The inputs and outputs arguments, when given, specify the keys of the dictionaries to use for the layer.
Let’s see how it works
[13]:
tf.reset_default_graph()
print(pred_fn(tf.constant(1.0)))
tf.reset_default_graph() # Remove alpha variable from the graph
print(pred_fn({"x": tf.constant(1.0)}))
Tensor("mul:0", shape=(), dtype=float32)
{'y_pred': <tf.Tensor 'mul:0' shape=() dtype=float32>}
Let’s check the output of this model (alpha is initialized randomly):
[14]:
tf.reset_default_graph()
y_pred = pred_fn(tf.constant(1.0))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y_pred))
1.6913518
Loss function
Let’s now define the loss function. A squared L2 loss will work fine here; let’s create a layer for it:
[15]:
@deepr.layers.layer(n_in=2, n_out=1)
def SquaredL2(tensors):
    x, y = tensors
    return tf.reduce_sum((x - y) ** 2)
[16]:
loss_fn = SquaredL2(inputs=("y_pred", "y"), outputs="loss")
Let’s see if it works:
[17]:
with tf.Session() as sess:
    print(sess.run(loss_fn((tf.constant(1.0), tf.constant(0.5)))))
    print(sess.run(loss_fn({"y_pred": tf.constant(1.0), "y": tf.constant(0.5)})))
0.25
{'loss': 0.25}
Optimizer
The last thing we need is the optimizer. See the optimizer reference
[18]:
optimizer_fn = deepr.optimizers.TensorflowOptimizer("Adam", 0.1)
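For intuition, an optimizer_fn is just a callable turning the loss tensor into a training op. A hand-rolled equivalent could look roughly like this (a sketch, not the actual TensorflowOptimizer implementation):

def adam_optimizer_fn(loss):
    # Sketch: minimize the loss with Adam (learning rate 0.1), updating the global step
    return tf.train.AdamOptimizer(learning_rate=0.1).minimize(
        loss, global_step=tf.train.get_or_create_global_step()
    )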
Trainer job
Since all these concepts are now defined, let’s create a Trainer job.
Make sure to check the trainer reference.
[19]:
job = deepr.jobs.Trainer(
    path_model="model",
    pred_fn=pred_fn,
    loss_fn=loss_fn,
    optimizer_fn=optimizer_fn,
    train_input_fn=train_input_fn,
    eval_input_fn=eval_input_fn,
    prepro_fn=prepro_fn,
)
Creating the job is lazy and doesn’t take any time. To run it, call the run method:
[20]:
job.run()
INFO:deepr.prepros.core:Not applying Repeat(10) (mode=eval)
INFO:deepr.jobs.trainer:Running final evaluation, using global_step = 320
INFO:deepr.prepros.core:Not applying Repeat(10) (mode=eval)
INFO:deepr.jobs.trainer:{'loss': 2.5611915e-12, 'global_step': 320}
The loss is effectively 0, great, we now know how to multiply by 2 :)
Let’s check that alpha is indeed equal to 2:
[21]:
experiment = job.create_experiment()
estimator = experiment.estimator
print(estimator.get_variable_value("alpha"))
2.0000005