[ ]:
!pip install deepr[cpu]

Config

Code can be translated into a simple (and hopefully intuitive) config syntax based on native Python types (dictionaries, tuples, lists, etc.).

The config system builds upon ideas from Thinc and gin-config as follows:

  • Support arbitrary trees of objects (like Thinc), but not arbitrary dependency injection like gin-config

  • Re-use macro syntax “$macro:name” from Thinc

  • Re-use special keyword “@reference” for references from gin-config

Design Requirements

Features

  • Trees of class instances of any type.

  • Static macros (simple parameter values).

  • Dynamic macros (parameter values created at run time, e.g. the MLFlow run id).

  • Positional and keyword arguments support.

  • References to the config and macros themselves for future use (upload to MLFlow for example).

  • Config evaluation mode: allow attributes to be configs, delegating object instantiation to the parent.

  • Easy serialization to json.

  • No “python magic”: avoid having registries and hidden configuration in package level variables that would not be passed along when sending jobs to remote machines (configs should be self-contained).

  • No special Config class (ideally dictionaries should be enough).

  • Uniqueness: there should ideally be only one way to write a config for a given object.
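
As a small illustration of the serialization requirement above, a config made only of dictionaries, lists, strings and numbers round-trips through the standard json module unchanged. The import string below is a hypothetical placeholder, not a class shipped with deepr.

```python
import json

# A config built only from native types (hypothetical import string).
config = {
    "type": "my_package.Model",
    "learning_rate": 0.1,
    "*": [1, 2, 3],
}

# Serialize to a JSON string and load it back.
serialized = json.dumps(config, indent=2, sort_keys=True)
restored = json.loads(serialized)

# The round trip preserves the config exactly.
assert restored == config
```

One caveat worth keeping in mind: JSON has no tuple type, so a tuple in a config comes back as a list after a round trip.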

Don’t support

  • Classes: objects in a config are necessarily class instances or literals (dict, tuple, float, int, etc.), never classes themselves.

  • Object references and scoping: references to other objects defined in the config are not allowed, as they might lead to confusion (singleton issue). They would also break uniqueness, since there would be multiple ways to define the “same” config.

  • Singletons: the singleton problem can be (and should be) solved in the code rather than in the config.
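
One possible way to handle a shared object in code rather than in the config is a cached factory. The sketch below is only an assumption about what such a solution could look like; the names Vocabulary, get_vocabulary, Encoder and Decoder are hypothetical and do not come from deepr.

```python
import functools


class Vocabulary:
    """Hypothetical resource that should be created once and shared."""

    def __init__(self, path):
        self.path = path


@functools.lru_cache(maxsize=None)
def get_vocabulary(path):
    # Cached by argument: the same path always returns the same instance.
    return Vocabulary(path)


class Encoder:
    def __init__(self, vocab_path):
        self.vocab = get_vocabulary(vocab_path)


class Decoder:
    def __init__(self, vocab_path):
        self.vocab = get_vocabulary(vocab_path)


# Both objects receive a plain string from their configs, yet share one instance.
encoder = Encoder("vocab.txt")
decoder = Decoder("vocab.txt")
assert encoder.vocab is decoder.vocab
```

With this pattern each config only carries a literal (the path), so no object reference is needed in the config itself.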

[1]:
import logging
import sys
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logging.getLogger("tensorflow").setLevel(logging.CRITICAL)

import deepr

Arbitrary tree of objects

A config is an arbitrarily nested dictionary that corresponds to a tree of Python objects.

An instance of a class can be described as a dictionary with the following special key:

  • type: the full import string of the instance’s class to be created.

All other keys will be treated as keyword arguments at instantiation time.

For example, given the following class and configuration:

[2]:
class Model:
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

config = {
    "type": "__main__.Model",
    "learning_rate": 0.1
}

parsed = deepr.parse_config(config)
model = deepr.from_config(parsed)
model.learning_rate
[2]:
0.1
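
To give an intuition of how a nested dictionary can map to a tree of objects, here is a minimal sketch of a recursive instantiation function. It is not deepr's implementation: the helper instantiate is hypothetical, it ignores the other special keys described on this page, and it uses the standard library's types.SimpleNamespace as a stand-in class so that the import string resolves anywhere.

```python
import importlib


def instantiate(config):
    """Recursively build objects from a nested config (illustrative only)."""
    if isinstance(config, dict):
        # Build children first so that parents receive instances.
        built = {key: instantiate(value) for key, value in config.items()}
        if "type" not in built:
            return built  # plain dictionary, nothing to instantiate
        # "type" must be a full import string such as "package.module.Class".
        module_name, _, class_name = built.pop("type").rpartition(".")
        cls = getattr(importlib.import_module(module_name), class_name)
        return cls(**built)  # remaining keys become keyword arguments
    if isinstance(config, (list, tuple)):
        return type(config)(instantiate(item) for item in config)
    return config  # literal value


config = {
    "type": "types.SimpleNamespace",
    "name": "model",
    "optimizer": {
        "type": "types.SimpleNamespace",
        "learning_rate": 0.1,
    },
    "layers": [64, 32],
}

model = instantiate(config)
print(model.optimizer.learning_rate)  # 0.1
```

The nested "optimizer" dictionary is turned into an object before the outer object is created, which is what makes the config a tree rather than a flat list of parameters.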

Macros

It is possible to define macro variables and reference them in the config.

For example, given a single macro params with one parameter learning_rate, a config can reference the learning rate using the $macro:param convention:

[3]:
macros = {
    "params": {
        "learning_rate": 0.1
    }
}

config = {
    "type": "Model",
    "learning_rate": "$params:learning_rate"
}

print(deepr.parse_config(config, macros))
{'type': 'Model', 'learning_rate': 0.1}
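
The substitution itself can be pictured as a recursive walk over the config. The following is a simplified sketch under the assumption that macro references are whole strings of the form "$macro:param"; the helper fill_macros is hypothetical and does not reproduce deepr's parser (for instance, it does not warn about unused macro parameters).

```python
def fill_macros(item, macros):
    """Replace "$macro:param" strings with values from macros (illustrative)."""
    if isinstance(item, dict):
        return {key: fill_macros(value, macros) for key, value in item.items()}
    if isinstance(item, (list, tuple)):
        return type(item)(fill_macros(value, macros) for value in item)
    if isinstance(item, str) and item.startswith("$") and ":" in item:
        # Split "$params:learning_rate" into ("params", "learning_rate").
        macro, _, param = item[1:].partition(":")
        return macros[macro][param]  # raises KeyError if the macro is unknown
    return item


macros = {"params": {"learning_rate": 0.1}}
config = {"type": "Model", "learning_rate": "$params:learning_rate"}

filled = fill_macros(config, macros)
print(filled)  # {'type': 'Model', 'learning_rate': 0.1}
```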

Dynamic Macros

For more advanced uses, it is possible to define dynamic macros as configs of instances of classes inheriting from dict.

For example:

[4]:
from datetime import datetime


class DateMacro(dict):
    def __init__(self):
        super().__init__(year=datetime.now().strftime('%Y'))


class Job:
    def __init__(self, year):
        self.year = year

macros = {
    "date": {
        "type": "__main__.DateMacro"
    }
}
config = {
    "type": "__main__.Job",
    "year": "$date:year"
}
parsed = deepr.parse_config(config, macros)
print(parsed)
job = deepr.from_config(parsed)
print(job.year)
{'type': '__main__.Job', 'year': '2020'}
2020

Positional Arguments

Use the special key * for positional arguments.

For example, given the following class and configuration:

[5]:
class Model:
    def __init__(self, *layers):
        self.layers = layers

config = {
    "type": "__main__.Model",
    "*": [1, 2, 3]
}
parsed = deepr.parse_config(config)
model = deepr.from_config(parsed)
print(model.layers)
(1, 2, 3)
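
Conceptually, the "*" entry is unpacked as positional arguments while every other key is passed by keyword. The sketch below illustrates that mapping in plain Python; the helper build is hypothetical and, unlike the examples above, takes the class directly instead of resolving a "type" import string.

```python
def build(cls, config):
    """Call cls with the "*" entry as positional arguments (illustrative)."""
    kwargs = dict(config)        # copy so the caller's config is untouched
    args = kwargs.pop("*", [])   # positional arguments, if any
    return cls(*args, **kwargs)  # other keys are keyword arguments


class Model:
    def __init__(self, *layers, name="model"):
        self.layers = layers
        self.name = name


# Equivalent to Model(1, 2, 3, name="encoder").
model = build(Model, {"*": [1, 2, 3], "name": "encoder"})
print(model.layers, model.name)  # (1, 2, 3) encoder
```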

Argument as config

In some cases, the argument to be provided to an instance should be left as a config. To specify that a block should not be instantiated but kept as a dict config, use the special key:

  • eval (Optional): can take 3 values:

    • "call": call the class or function referenced by “type” with the provided arguments.

    • "partial": return a callable equivalent to the callable referenced by “type” with pre-filled arguments.

    • None: the dictionary will be kept as is and no instance will be created.

For example, given the following class and configuration:

[6]:
class Job:
    def __init__(self, model_config):
        self.model_config = model_config

    def run(self):
        # Job is responsible for instantiating the model from the config
        model = deepr.from_config(self.model_config)

config = {
    "type": "__main__.Job",
    "model_config": {
        "type": "__main__.Model",
        "eval": None,
        "*": [1, 2, 3]
    }
}

parsed = deepr.parse_config(config)
job = deepr.from_config(parsed)
print(job.model_config)
{'type': '__main__.Model', '*': [1, 2, 3]}
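
The three eval values can be compared side by side with a small stand-alone sketch. It is not deepr's code: the helper evaluate is hypothetical, it receives the callable directly rather than through "type", and it assumes that "call" is the behavior when the eval key is absent (consistent with the earlier examples, where configs without eval are instantiated).

```python
import functools


def evaluate(func, config):
    """Apply the three evaluation modes to a callable (illustrative only)."""
    params = {key: value for key, value in config.items() if key != "eval"}
    mode = config.get("eval", "call")  # assumption: "call" when key is absent
    if mode is None:
        return params  # keep the dictionary as is, without the eval key
    args = params.pop("*", [])
    if mode == "call":
        return func(*args, **params)
    if mode == "partial":
        return functools.partial(func, *args, **params)
    raise ValueError(f"Unknown eval mode: {mode!r}")


def scale(value, factor=1):
    return value * factor


called = evaluate(scale, {"*": [3], "factor": 2})
partial = evaluate(scale, {"eval": "partial", "factor": 2})
kept = evaluate(scale, {"eval": None, "factor": 2})

print(called)      # 6
print(partial(3))  # 6
print(kept)        # {'factor': 2}
```

The "partial" mode is useful when some arguments are only known later, since the returned callable can be completed by whoever receives it.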

References

Though references to other objects defined in the tree are not possible, three special references are available:

  • @self: refers to the unparsed config

  • @macros: refers to the unparsed macros

  • @macros_eval: refers to the evaluated macros

For example:

[7]:
macros = {
    "date": {
        "type": "__main__.DateMacro"
    },
    "params": {
        "learning_rate": 0.1
    }
}

config = {
    "config": "@self",
    "macros": "@macros",
    "macros_eval": "@macros_eval"
}

parsed = deepr.parse_config(config, macros)
print(parsed["config"])
print(parsed["macros"])
print(parsed["macros_eval"])
WARNING:deepr.config.base:- MACRO PARAM NOT USED: macro = 'date', param = 'year'
WARNING:deepr.config.base:- MACRO PARAM NOT USED: macro = 'params', param = 'learning_rate'
{'config': '@self', 'macros': '@macros', 'macros_eval': '@macros_eval', 'eval': None}
{'date': {'type': '__main__.DateMacro'}, 'params': {'learning_rate': 0.1}, 'eval': None}
{'date': {'year': '2020'}, 'params': {'learning_rate': 0.1}, 'eval': None}

These references are mainly intended for logging and should be used sparingly.