[ ]:
!pip install deepr[cpu]
Config
It is possible (and hopefully intuitive) to translate code into a simple config syntax based on native python types (dictionaries, tuples, lists, etc.).
The config system builds upon ideas from Thinc and gin-config as follows:
- Support arbitrary trees of objects (like Thinc), but not arbitrary dependency injection like gin-config.
- Re-use the macro syntax “$macro:name” from Thinc.
- Re-use the special keyword “@reference” for references from gin-config.
Design Requirements
Features
- Trees of class instances of any type.
- Static macros (simple parameter values).
- Dynamic macros (parameter values created at run time, e.g. the MLFlow run id).
- Positional and keyword arguments support.
- References to the config and macros themselves for future use (upload to MLFlow for example).
- Config evaluation mode: allow attributes to be configs, delegating object instantiation to the parent.
- Easy serialization to json.
- No “python magic”: avoid registries and hidden configuration in package-level variables that would not be passed along when sending jobs to remote machines (configs should be self-contained).
- No special Config class (ideally dictionaries should be enough).
- Unicity: there should ideally be only one way to write a config for a given object.
Don’t support
- Classes: objects in the config are necessarily instances of classes or literals, i.e. dict, tuple, float, int, etc.
- Object references and scoping: don’t allow references to other objects defined in the config, as it might lead to confusion (singleton issue). It would also break unicity, as there would be multiple ways to define the “same” config.
- Singletons: the singleton problem can be (and should be) solved in the code rather than in the config, as sketched below.
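For instance, here is a minimal sketch of solving the singleton problem in code rather than in the config: a cached factory ensures a single shared instance (Vocabulary and vocabulary_singleton are hypothetical names, not part of deepr).
[ ]:
from functools import lru_cache

class Vocabulary:
    def __init__(self, path):
        self.path = path

@lru_cache(maxsize=None)
def vocabulary_singleton(path):
    # Same path -> same instance, no matter how many configured
    # objects call this factory at instantiation time.
    return Vocabulary(path)

assert vocabulary_singleton("words.txt") is vocabulary_singleton("words.txt")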
[1]:
import logging
import sys
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logging.getLogger("tensorflow").setLevel(logging.CRITICAL)
import deepr
Arbitrary tree of objects
A Config is any arbitrary nested dictionary that corresponds to a tree of python objects.
An instance of a class can be described as a dictionary with the following special key: type, the full import string of the instance’s class to be created. All other keys will be treated as keyword arguments at instantiation time.
For example, if you have the following class and configuration:
[2]:
class Model:
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

config = {
    "type": "__main__.Model",
    "learning_rate": 0.1
}
parsed = deepr.parse_config(config)
model = deepr.from_config(parsed)
model.learning_rate
[2]:
0.1
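Because configs are arbitrary trees, a nested dictionary with its own type key is itself instantiated and passed to its parent. A minimal sketch, reusing Model from above (the Trainer class is hypothetical):
[ ]:
class Trainer:
    def __init__(self, model):
        self.model = model

config = {
    "type": "__main__.Trainer",
    "model": {
        "type": "__main__.Model",
        "learning_rate": 0.1
    }
}
trainer = deepr.from_config(deepr.parse_config(config))
trainer.model.learning_rate  # The nested dict was instantiated as a Model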
Macros
It is possible to define macro variables and reference them in the config.
For example, given the following macros (there is one macro params with one parameter learning_rate), a config can reference the learning rate using the $macro:param convention:
[3]:
macros = {
    "params": {
        "learning_rate": 0.1
    }
}

config = {
    "type": "Model",
    "learning_rate": "$params:learning_rate"
}
print(deepr.parse_config(config, macros))
{'type': 'Model', 'learning_rate': 0.1}
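Since references between objects in a config are not allowed (see the design requirements above), a macro is also the natural way to share one value across several entries; presumably each occurrence is substituted independently. A small sketch under that assumption (the keys below are illustrative):
[ ]:
macros = {"params": {"dim": 64}}
config = {
    "type": "Model",
    "encoder_dim": "$params:dim",
    "decoder_dim": "$params:dim"
}
print(deepr.parse_config(config, macros))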
Dynamic Macros
For more advanced uses, it is possible to define dynamic macros as configs of instances of classes that inherit from dict.
For example:
[4]:
from datetime import datetime

class DateMacro(dict):
    def __init__(self):
        super().__init__(year=datetime.now().strftime('%Y'))

class Job:
    def __init__(self, year):
        self.year = year

macros = {
    "date": {
        "type": "__main__.DateMacro"
    }
}

config = {
    "type": "__main__.Job",
    "year": "$date:year"
}
parsed = deepr.parse_config(config, macros)
print(parsed)
job = deepr.from_config(parsed)
print(job.year)
{'type': '__main__.Job', 'year': '2020'}
2020
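The same mechanism covers run-time values like the MLFlow run id mentioned in the design requirements. A hedged sketch using a random uuid as a stand-in (RunMacro and Experiment are hypothetical names):
[ ]:
import uuid

class RunMacro(dict):
    def __init__(self):
        # Evaluated at parse time, so every parse gets a fresh id.
        super().__init__(run_id=str(uuid.uuid4()))

class Experiment:
    def __init__(self, run_id):
        self.run_id = run_id

macros = {"run": {"type": "__main__.RunMacro"}}
config = {"type": "__main__.Experiment", "run_id": "$run:run_id"}
experiment = deepr.from_config(deepr.parse_config(config, macros))
print(experiment.run_id)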
Positional Arguments
Use the special key * for positional arguments.
For example, if you have the following class and configuration:
[5]:
class Model:
    def __init__(self, *layers):
        self.layers = layers

config = {
    "type": "__main__.Model",
    "*": [1, 2, 3]
}
parsed = deepr.parse_config(config)
model = deepr.from_config(parsed)
print(model.layers)
(1, 2, 3)
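Since both positional and keyword arguments are listed as supported features, the * key should combine with regular keyword arguments in the same block. A sketch under that assumption (this Model variant is illustrative):
[ ]:
class Model:
    def __init__(self, *layers, activation="relu"):
        self.layers = layers
        self.activation = activation

config = {
    "type": "__main__.Model",
    "*": [1, 2, 3],
    "activation": "tanh"
}
model = deepr.from_config(deepr.parse_config(config))
print(model.layers, model.activation)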
Argument as config
In some cases, the argument to be provided to an instance should be left as a config, delegating instantiation to the parent. This behavior is controlled by the optional special key eval, which can take 3 values:
- "call": call the class or function referenced by “type” with the provided arguments.
- "partial": return a callable equivalent to the callable referenced by “type”, with pre-filled arguments (a sketch of this mode follows the example below).
- None: the dictionary will be kept as-is and no instance will be created.
For example, if you have the following class and configuration:
[6]:
class Job:
    def __init__(self, model_config):
        self.model_config = model_config

    def run(self):
        # Job is responsible for instantiating the model from the config
        model = deepr.from_config(self.model_config)

config = {
    "type": "__main__.Job",
    "model_config": {
        "type": "__main__.Model",
        "eval": None,
        "*": [1, 2, 3]
    }
}
parsed = deepr.parse_config(config)
job = deepr.from_config(parsed)
print(job.model_config)
{'type': '__main__.Model', '*': [1, 2, 3]}
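The "partial" mode can be sketched the same way: instead of the raw dictionary, from_config is expected to return a callable with the configured arguments pre-filled, so the parent decides when to instantiate. A sketch of the expected behavior:
[ ]:
config = {
    "type": "__main__.Model",
    "eval": "partial",
    "*": [1, 2, 3]
}
model_fn = deepr.from_config(deepr.parse_config(config))
model = model_fn()  # Instantiation is deferred until the call
print(model.layers)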
References
Though references to other objects defined in the tree are not possible, 3 special references are available:
- @self: refers to the unparsed config.
- @macros: refers to the unparsed macros.
- @macros_eval: refers to the evaluated macros.
For example:
[7]:
macros = {
    "date": {
        "type": "__main__.DateMacro"
    },
    "params": {
        "learning_rate": 0.1
    }
}

config = {
    "config": "@self",
    "macros": "@macros",
    "macros_eval": "@macros_eval"
}
parsed = deepr.parse_config(config, macros)
print(parsed["config"])
print(parsed["macros"])
print(parsed["macros_eval"])
WARNING:deepr.config.base:- MACRO PARAM NOT USED: macro = 'date', param = 'year'
WARNING:deepr.config.base:- MACRO PARAM NOT USED: macro = 'params', param = 'learning_rate'
{'config': '@self', 'macros': '@macros', 'macros_eval': '@macros_eval', 'eval': None}
{'date': {'type': '__main__.DateMacro'}, 'params': {'learning_rate': 0.1}, 'eval': None}
{'date': {'year': '2020'}, 'params': {'learning_rate': 0.1}, 'eval': None}
These references are intended mainly for logging and should be used sparingly.
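For example, a job could receive @self and dump it to json for reproducibility, which is straightforward since configs serialize directly to json (per the design requirements). A minimal sketch (SaveConfig is a hypothetical class, not a deepr job):
[ ]:
import json

class SaveConfig:
    def __init__(self, config, path="config.json"):
        self.config = config
        self.path = path

    def run(self):
        # Dump the unparsed config so the experiment can be
        # reproduced from the saved artifact.
        with open(self.path, "w") as file:
            json.dump(self.config, file, indent=2)

config = {"type": "__main__.SaveConfig", "config": "@self"}
job = deepr.from_config(deepr.parse_config(config))
job.run()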