{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install deepr[cpu]" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "# Quickstart" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook is a gentle introduction to the few concepts and abstractions of deepr.\n", "\n", "It demonstrates how to train a model that learns how to multiply a number by 2.\n", "\n", "To train a model with deepr the main entry point is the [Trainer](https://criteo.github.io/deepr/API/_autosummary/deepr.jobs.Trainer.html#deepr.jobs.Trainer) job.\n", "\n", "It is important at this point to stress that `deepr` is not yet another library to build neural networks, but merely a utility to build functions that operate on basic Tensorflow types, i.e. [tf.Tensor](https://www.tensorflow.org/api_docs/python/tf/Tensor) and [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset).\n", "\n", "Using functional programming makes it easy to lazily define graphs that will only be built at run time by the [tf.estimator](https://www.tensorflow.org/guide/estimator) high-level API.\n", "\n", "The `Trainer` job uses most of the [important concepts](https://criteo.github.io/deepr/API/core.html) of deepr, while only expecting basic types (mainly functions operating on datasets, dictionaries of tensors, etc.).\n", "\n", "\n", "* `path_model : str`\n", " Path to the model directory. Can be either local or HDFS.\n", " \n", "* `pred_fn : Callable[[Dict[str, tf.Tensor], str], Dict[str, tf.Tensor]]`\n", " Typically a [Layer](https://criteo.github.io/deepr/API/_autosummary/deepr.layers.Layer.html#deepr.layers.Layer) instance, but in general, any callable.\n", "\n", "* `loss_fn : Callable[[Dict[str, tf.Tensor], str], Dict[str, tf.Tensor]]`\n", " Typically a [Layer](https://criteo.github.io/deepr/API/_autosummary/deepr.layers.Layer.html#deepr.layers.Layer) instance, but in general, any callable.\n", "\n", "* `optimizer_fn : Callable[[tf.Tensor], tf.Tensor]`\n", " Typically an [Optimizer](https://criteo.github.io/deepr/API/_autosummary/deepr.optimizers.Optimizer.html#deepr.optimizers.Optimizer) instance, but in general, any callable.\n", "\n", "* `train_input_fn : Callable[[], tf.data.Dataset]`\n", " Typically a [Reader](https://criteo.github.io/deepr/API/_autosummary/deepr.readers.Reader.html#deepr.readers.Reader) instance, but in general, any callable.\n", "\n", "* `eval_input_fn : Callable[[], tf.data.Dataset]`\n", " Typically a [Reader](https://criteo.github.io/deepr/API/_autosummary/deepr.readers.Reader.html#deepr.readers.Reader) instance, but in general, any callable.\n", "\n", "* `prepro_fn: Callable[[tf.data.Dataset, str], tf.data.Dataset], Optional`\n", " Typically a [Prepro](https://criteo.github.io/deepr/API/_autosummary/deepr.prepros.Prepro.html#deepr.prepros.Prepro) instance, but in general, any callable.\n", "\n", "There are more parameters that use the other concepts (hooks, metrics, exporter, ...) and this will be covered in another guide.\n", "\n", "So to train our model, we need to define all that, let's start !" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first step is to build a dataset. For this we will build a synthetic dataset of numbers of (x, 2x).\n", "\n", "Also see other ways to build a dataset in the [reader reference](https://criteo.github.io/deepr/API/core.html#reader)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some imports first" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import logging\n", "import sys\n", "logging.basicConfig(level=logging.INFO, stream=sys.stdout)\n", "logging.getLogger(\"tensorflow\").setLevel(logging.CRITICAL)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "import deepr\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "if deepr.io.Path(\"model\").is_dir():\n", " deepr.io.Path(\"model\").delete_dir()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's define a generator function and then use a [GeneratorReader](https://criteo.github.io/deepr/API/_autosummary/deepr.readers.GeneratorReader.html#deepr.readers.GeneratorReader) to create a `tf.data.Dataset`" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def generator_fn():\n", " for _ in range(1000):\n", " x = np.random.random()\n", " yield {\"x\": x, \"y\": 2 * x}\n", "\n", "reader = deepr.readers.GeneratorReader(\n", " generator_fn,\n", " output_types={\"x\":tf.float32, \"y\":tf.float32},\n", " output_shapes={\"x\":(), \"y\":()}\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `Reader` classes are simple helper functions to create `tf.data.Dataset`, heavily inspired by the `tensorflow_dataset` package.\n", "\n", "Once the reader is configured, you can create a new `Dataset` with" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n" ] } ], "source": [ "dataset = reader.as_dataset()\n", "print(dataset)\n", "dataset = reader() # Simply an alias for as_dataset\n", "print(dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Iterating over a `tf.data.Dataset` in \"graph\" mode is not possible.\n", "\n", "The base `Reader` class makes it possible to iterate over the dataset, faking eager-execution mode (under the hood it simply creates a session in the special `__iter__` method).\n", "\n", "Let's have a look at the content of our dataset" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'x': 0.026452266, 'y': 0.05290453}\n", "{'x': 0.22739638, 'y': 0.45479277}\n", "{'x': 0.5724985, 'y': 1.144997}\n", "{'x': 0.403317, 'y': 0.806634}\n", "{'x': 0.21341616, 'y': 0.42683232}\n", "{'x': 0.83121186, 'y': 1.6624237}\n", "{'x': 0.3990266, 'y': 0.7980532}\n", "{'x': 0.7587566, 'y': 1.5175132}\n", "{'x': 0.24581175, 'y': 0.4916235}\n", "{'x': 0.40846375, 'y': 0.8169275}\n", "{'x': 0.27732524, 'y': 0.5546505}\n" ] } ], "source": [ "for index, item in enumerate(reader):\n", " print(item)\n", " if index == 10:\n", " break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `Trainer` job expects 2 `input_fn` that are simple callables creating new `tf.data.Dataset`.\n", "\n", "Our `reader` does exactly that, so let's set" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "train_input_fn = reader\n", "eval_input_fn = reader" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepro" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have datasets, we need to preprocess them before feeding data to our model. In this example, we only need to create batches of data, and allow multiple iterations over the dataset to be able to perform multiple epochs.\n", "\n", "Let's use the `prepro` module to functionally define a preprocessing function.\n", "\n", "See the [prepro reference](https://criteo.github.io/deepr/API/core.html#prepro)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "prepro_fn = deepr.prepros.Serial(\n", " deepr.prepros.Batch(batch_size=32),\n", " deepr.prepros.Repeat(10, modes=[tf.estimator.ModeKeys.TRAIN])\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, the output of this prepro function is a batched dataset" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "prepro_fn(reader())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check the result of our preprocessing by iterating over the dataset. We use the helper function `from_dataset` that creates a `reader` from any `tf.data.Dataset`, which gives us eager-like iteration over the underlying dataset." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'x': array([8.38533640e-01, 5.11415541e-01, 4.91451062e-02, 6.71378195e-01,\n", " 7.19314665e-02, 2.07208991e-01, 6.07405782e-01, 2.14489564e-01,\n", " 1.24138966e-01, 5.16671121e-01, 2.33591374e-04, 3.69159013e-01,\n", " 3.05574089e-01, 9.81275201e-01, 4.54333931e-01, 3.23030204e-01,\n", " 6.02127731e-01, 2.13016793e-01, 8.41403484e-01, 6.13585055e-01,\n", " 1.33147994e-02, 6.54389381e-01, 8.09324920e-01, 5.17527759e-01,\n", " 2.62713879e-01, 2.71976054e-01, 4.55039740e-01, 2.46606708e-01,\n", " 8.55176270e-01, 2.10764825e-01, 9.98475403e-02, 1.92478955e-01],\n", " dtype=float32), 'y': array([1.6770673e+00, 1.0228311e+00, 9.8290212e-02, 1.3427564e+00,\n", " 1.4386293e-01, 4.1441798e-01, 1.2148116e+00, 4.2897913e-01,\n", " 2.4827793e-01, 1.0333422e+00, 4.6718275e-04, 7.3831803e-01,\n", " 6.1114818e-01, 1.9625504e+00, 9.0866786e-01, 6.4606041e-01,\n", " 1.2042555e+00, 4.2603359e-01, 1.6828070e+00, 1.2271701e+00,\n", " 2.6629599e-02, 1.3087788e+00, 1.6186498e+00, 1.0350555e+00,\n", " 5.2542776e-01, 5.4395211e-01, 9.1007948e-01, 4.9321342e-01,\n", " 1.7103525e+00, 4.2152965e-01, 1.9969508e-01, 3.8495791e-01],\n", " dtype=float32)}\n" ] } ], "source": [ "for item in deepr.readers.base.from_dataset(prepro_fn(reader())):\n", " print(item)\n", " break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have a preprocessed dataset, let's build the model. \n", "\n", "The dataset yields dictionaries of tensors.\n", "\n", "The model is made of 2 main components\n", "\n", "1. `pred_fn(tensors: Dict, mode) -> Dict` operates on the dataset dictionaries, creates new tensors (the predictions).\n", "2. `loss_fn(tensors: Dict, mode) -> Dict` operates on the dataset and `pred_fn` results, creates at least one new tensor `loss`.\n", "\n", "We're going to use the `layer` module to quickly define those functions.\n", "\n", "Make sure to check the [layer reference](https://criteo.github.io/deepr/API/core.html#layer) for more information." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pred function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first part of the model is the prediction function.\n", "\n", "Here it's pretty simple : it will predict a `y_pred` with an `alpha` parameter such that `y_pred = alpha * x`\n", "\n", "We first define this as a `Multiply` layer :" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "@deepr.layers.layer(n_in=1, n_out=1)\n", "def Multiply(tensors):\n", " alpha = tf.get_variable(name=\"alpha\", shape=(), dtype=tf.float32)\n", " return alpha * tensors" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `layer` decorator creates a `Layer` class from the function, roughly equivalent to\n", "\n", "```python\n", "class Multiply:\n", " \n", " def __init__(self, n_in=1, n_out=1, inputs=None, outputs=None, name=None):\n", " self.n_in = n_in\n", " self.n_out = n_out\n", " self.inputs = inputs\n", " self.outputs = outputs\n", " self.name = name\n", " \n", " def __call__(self, tensors, mode: str):\n", " if isinstance(tensors, dict):\n", " return self.forward_as_dict(tensors, mode)\n", " else:\n", " return self.forward(tensors, mode)\n", " \n", " def forward(self, tensors, mode: str):\n", " alpha = tf.get_variable(name=\"alpha\", shape=(), dtype=tf.float32)\n", " return alpha * tensors\n", " \n", " def forward_as_dict(self, tensors: Dict, mode: str) -> Dict:\n", " return {self.outputs: self.forward(tensors[self.inputs])}\n", "```\n", "\n", "We can instantiate our `Layer` with" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "pred_fn = Multiply(inputs=\"x\", outputs=\"y_pred\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The power of the base [Layer](https://criteo.github.io/deepr/API/_autosummary/deepr.layers.Layer.html#deepr.layers.Layer) class is that layers are actually functions that can operate on both dictionaries and tuples of tensors.\n", "\n", "The `inputs` and `outputs` arguments, when given, specify the keys of the dictionaries to use for the layer.\n", "\n", "Let's see how it works" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tensor(\"mul:0\", shape=(), dtype=float32)\n", "{'y_pred': }\n" ] } ], "source": [ "tf.reset_default_graph()\n", "print(pred_fn(tf.constant(1.0)))\n", "tf.reset_default_graph() # Remove alpha variable from the graph\n", "print(pred_fn({\"x\": tf.constant(1.0)}))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check the output of this model (alpha is initialized randomly) :" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.6913518\n" ] } ], "source": [ "tf.reset_default_graph()\n", "y_pred = pred_fn(tf.constant(1.0))\n", "with tf.Session() as sess:\n", " sess.run(tf.global_variables_initializer())\n", " print(sess.run(y_pred))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loss function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's then define the loss function. A squared l2 loss will work fine here, let's create a layer for this :" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "@deepr.layers.layer(n_in=2, n_out=1)\n", "def SquaredL2(tensors):\n", " x, y = tensors\n", " return tf.reduce_sum((x-y)**2)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "loss_fn = SquaredL2(inputs=(\"y_pred\", \"y\"), outputs=\"loss\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see if it works : " ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.25\n", "{'loss': 0.25}\n" ] } ], "source": [ "with tf.Session() as sess:\n", " print(sess.run(loss_fn((tf.constant(1.0), tf.constant(0.5)))))\n", " print(sess.run(loss_fn({\"y_pred\": tf.constant(1.0), \"y\": tf.constant(0.5)})))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optimizer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The last thing we need is the optimizer. See the [optimizer reference](https://criteo.github.io/deepr/API/core.html#optimizer)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "optimizer_fn = deepr.optimizers.TensorflowOptimizer(\"Adam\", 0.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Trainer job" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since all these concepts are now defined, let's create a `Trainer` job. \n", "\n", "Make sure to check the [trainer reference](https://criteo.github.io/deepr/API/_autosummary/deepr.jobs.Trainer.html#deepr.jobs.Trainer)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "job = deepr.jobs.Trainer(\n", " path_model=\"model\", \n", " pred_fn=pred_fn, \n", " loss_fn=loss_fn,\n", " optimizer_fn=optimizer_fn,\n", " train_input_fn=train_input_fn,\n", " eval_input_fn=eval_input_fn,\n", " prepro_fn=prepro_fn\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creating the job is lazy and doesn't take any time. To run it, call the run method : " ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:deepr.prepros.core:Not applying Repeat(10) (mode=eval)\n", "INFO:deepr.jobs.trainer:Running final evaluation, using global_step = 320\n", "INFO:deepr.prepros.core:Not applying Repeat(10) (mode=eval)\n", "INFO:deepr.jobs.trainer:{'loss': 2.5611915e-12, 'global_step': 320}\n" ] } ], "source": [ "job.run()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The loss is 0, great, we now know how to multiply by 2 :)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check alpha is indeed equal to 2 : " ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.0000005\n" ] } ], "source": [ "experiment = job.create_experiment()\n", "estimator = experiment.estimator\n", "print(estimator.get_variable_value(\"alpha\"))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" }, "pycharm": { "stem_cell": { "cell_type": "raw", "metadata": { "collapsed": false }, "source": [] } } }, "nbformat": 4, "nbformat_minor": 1 }