deepr.jobs package
Submodules
deepr.jobs.base module
Interface for jobs
deepr.jobs.cleanup_checkpoints module
Cleanup Checkpoints in path_model
deepr.jobs.combinators module
Pipeline
deepr.jobs.copy_dir module
Copy Directory
deepr.jobs.evaluate module
Evaluate Job.
deepr.jobs.export_xla_model_metadata module
Export xla compatible model metadata from a saved model
- class deepr.jobs.export_xla_model_metadata.ExportXlaModelMetadata(path_optimized_model, path_metadata, graph_name, metadata_name, feed_shapes, fetch_shapes)[source]
Bases:
Job
Export xla compatible model metadata from a saved model
- path_optimized_model
Path to directory containing optimized saved model exports to convert
- Type:
deepr.jobs.log_metric module
Log Metric Job
deepr.jobs.mlflow_save_configs module
Upload Configs to MLFlow
- class deepr.jobs.mlflow_save_configs.MLFlowFormatter(include_keys=None, skip_keys=(), skip_values=())[source]
Bases:
object
Flattens dictionaries and extracts sub-keys
Example
>>> from deepr.jobs import MLFlowFormatter
>>> params = {
...     "foo": {
...         "type": "foo.Foo",
...         "bar": {
...             "x": 1,
...             "y": 2,
...         }
...     }
... }
>>> formatter = MLFlowFormatter(include_keys=("bar", "x"), skip_values=(2,))
>>> formatter(params)
{'bar.x': 1, 'x': 1}
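The flattening step on its own can be illustrated with a minimal sketch. The helper `flatten` below is hypothetical and is not the MLFlowFormatter implementation: it only shows how a nested dictionary can be reduced to dot-separated keys, and it does not reproduce the include_keys, skip_keys or skip_values handling.

```python
def flatten(params, prefix=""):
    """Flatten a nested dict into a single-level dict with dot-separated keys."""
    flat = {}
    for key, value in params.items():
        # Build the dotted name of the current entry.
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the prefix along.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


params = {"foo": {"type": "foo.Foo", "bar": {"x": 1, "y": 2}}}
flat = flatten(params)
```

Flat keys of this kind are convenient for logging parameters, since each leaf value ends up under a single string name.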
deepr.jobs.mlflow_save_info module
Save MLFlow info to path
deepr.jobs.optimize_saved_model module
Converts SavedModel into an optimized protobuf for inference
- class deepr.jobs.optimize_saved_model.OptimizeSavedModel(path_saved_model, path_optimized_model, graph_name, feeds, fetch, new_names=<factory>, blacklisted_variables=<factory>)[source]
Bases:
Job
Converts SavedModel into an optimized protobuf for inference
This job reads the input SavedModel, renames some nodes using the new_names argument (raising an error if some renames cannot be found), creates placeholders given by feeds (and removes all other placeholders not in this list), and finally freezes the subgraph that produces the output tensor fetch.
When creating the original SavedModel, it is recommended to use tf.identity operators to mark some tensors as future feeds or fetches.
WARNING: successful completion of this job is no guarantee that the exported Graph is correct. It is recommended to test the export in a separate job.
- blacklisted_variables
List of variable names not to include in the export
- exception deepr.jobs.optimize_saved_model.TensorsNotFoundError(tensors)[source]
Bases:
ValueError
No tensors found for those names.
Most TensorFlow operators take a name argument that should be used if you want a custom name for a tensor; otherwise a default is used.
If accessing the operator that creates a Tensor is not possible, you can use tf.identity to name the Tensor. Note, however, that this adds a new op to the graph that creates a copy of the input Tensor, so its use should be limited to avoid overhead.
- deepr.jobs.optimize_saved_model.make_placeholders(graph_def, names)[source]
Create placeholders for names and remove other placeholders
- Parameters:
graph_def (tf.GraphDef) – Graph definition
names (List[str]) – Names of placeholders to keep / create for this graph
- Returns:
A copy of the input GraphDef with new placeholders
- Return type:
tf.GraphDef
- Raises:
TensorsNotFoundError – If names refers to a node that is not present
- deepr.jobs.optimize_saved_model.rename_nodes(graph_def, new_names)[source]
Rename items in the graph to new ones defined in new_names
- Parameters:
- Returns:
A copy of the input GraphDef with renamed nodes
- Return type:
tf.GraphDef
- Raises:
TensorsNotFoundError – If new_names refers to a node not found in graph_def
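The "rename, or fail if a name is missing" contract can be sketched without TensorFlow. Everything below is an assumption made for illustration: nodes are plain dictionaries rather than entries of a tf.GraphDef, and `NamesNotFoundError` and `rename_nodes_sketch` are hypothetical names, not the library's code.

```python
import copy


class NamesNotFoundError(ValueError):
    """Hypothetical stand-in for TensorsNotFoundError."""


def rename_nodes_sketch(nodes, new_names):
    """Return a renamed copy of `nodes`; raise if a name to rename is absent."""
    present = {node["name"] for node in nodes}
    missing = sorted(set(new_names) - present)
    if missing:
        raise NamesNotFoundError(f"No nodes found for names: {missing}")
    # Work on a copy so the input graph description is left untouched.
    renamed = copy.deepcopy(nodes)
    for node in renamed:
        node["name"] = new_names.get(node["name"], node["name"])
        # Inputs refer to other nodes by name, so they must be updated too.
        node["inputs"] = [new_names.get(i, i) for i in node["inputs"]]
    return renamed


nodes = [
    {"name": "x", "inputs": []},
    {"name": "dense/BiasAdd", "inputs": ["x"]},
]
renamed = rename_nodes_sketch(nodes, {"dense/BiasAdd": "logits"})
```

Checking for missing names before mutating anything means a failed call never leaves a half-renamed graph behind.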
deepr.jobs.params_tuner module
Hyper parameter tuning
- class deepr.jobs.params_tuner.GridSampler(param_grid, repeat=1)[source]
Bases:
Sampler
Grid Sampler wrapping ParameterGrid from sklearn
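As a rough illustration of grid sampling, the sketch below enumerates the Cartesian product of a parameter grid using only the standard library. The function `iter_grid` is hypothetical, and the meaning given to `repeat` here (each combination is yielded `repeat` times in a row) is an assumption, not documented behaviour of GridSampler.

```python
import itertools


def iter_grid(param_grid, repeat=1):
    """Yield one dict per combination of values in `param_grid`."""
    # Sort keys so the iteration order is deterministic.
    keys = sorted(param_grid)
    for values in itertools.product(*(param_grid[key] for key in keys)):
        for _ in range(repeat):
            yield dict(zip(keys, values))


samples = list(iter_grid({"learning_rate": [0.1, 0.01], "batch_size": [32]}, repeat=2))
```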
- class deepr.jobs.params_tuner.ParamsSampler(param_grid, n_iter=10, repeat=1, seed=None)[source]
Bases:
Sampler
Parameter Sampler
deepr.jobs.save_dataset module
Save Dataset Job.
deepr.jobs.trainer module
Train Job
- class deepr.jobs.trainer.ConfigProto(inter_op_parallelism_threads=16, intra_op_parallelism_threads=16, log_device_placement=False, gpu_device_count=0, cpu_device_count=16, **kwargs)[source]
Bases:
dict
Named Dict for ConfigProto arguments with reasonable defaults.
- class deepr.jobs.trainer.EvalSpec(steps=None, name=None, start_delay_secs=120, throttle_secs=100)[source]
Bases:
dict
Named Dict for EvalSpec arguments with reasonable defaults.
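The "named dict with defaults" pattern used by these spec classes can be sketched as a dict subclass whose constructor fills in default values. `EvalSpecSketch` is a hypothetical illustration that reuses the defaults from the signature above; the actual class may be implemented differently.

```python
class EvalSpecSketch(dict):
    """Dict whose keys and defaults are declared by the constructor signature."""

    def __init__(self, steps=None, name=None, start_delay_secs=120, throttle_secs=100):
        super().__init__(
            steps=steps,
            name=name,
            start_delay_secs=start_delay_secs,
            throttle_secs=throttle_secs,
        )


# Override one value; the others keep their defaults.
spec = EvalSpecSketch(throttle_secs=30)
```

Because the result is a plain dict, it can be unpacked with `**spec` into any function that accepts these keyword arguments.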
- class deepr.jobs.trainer.FinalSpec[source]
Bases:
dict
Named Dict for final evaluation with reasonable defaults.
- class deepr.jobs.trainer.TrainSpec(max_steps=None)[source]
Bases:
dict
Named Dict for TrainSpec arguments with reasonable defaults.
- class deepr.jobs.trainer.Trainer(path_model, pred_fn, loss_fn, optimizer_fn, train_input_fn, eval_input_fn, prepro_fn=<function Trainer.<lambda>>, initializer_fn=<function Trainer.<lambda>>, exporters=<factory>, train_metrics=<factory>, eval_metrics=<factory>, final_metrics=<factory>, train_hooks=<factory>, eval_hooks=<factory>, final_hooks=<factory>, train_spec=<factory>, eval_spec=<factory>, final_spec=<factory>, run_config=<factory>, config_proto=<factory>, random_seed=42, preds=<factory>)[source]
Bases:
TrainerBase
Train and evaluate a tf.Estimator on the current machine.
- pred_fn
Typically a Layer instance, but in general, any callable.
Its signature is the following:
- features : Dict
Features, yielded by the dataset
- predictions : Dict
Predictions
- loss_fn
Typically a Layer instance, but in general, any callable.
Its signature is the following:
- features_and_predictions : Dict
Features and predictions combined
- losses : Dict
Losses and metrics
The value for key “loss” from the output dictionary is then fed to the optimizer_fn.
- optimizer_fn
Typically an Optimizer instance, but in general, any callable.
Its signature is the following:
- inputs : Dict[str, tf.Tensor]
Typically has key “loss”
- outputs : Dict[str, tf.Tensor]
Needs key “train_op”
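The way the three callables chain together can be sketched with plain Python values. This is an illustration only: floats and strings stand in for tf.Tensor objects and ops, the feature names "x" and "y" are made up, and the real Trainer wiring may differ.

```python
def pred_fn(features):
    """Map features to predictions (toy model: y_pred = 2 * x)."""
    return {"y_pred": 2.0 * features["x"]}


def loss_fn(features_and_predictions):
    """Compute losses from the combined features and predictions."""
    error = features_and_predictions["y_pred"] - features_and_predictions["y"]
    return {"loss": error ** 2}


def optimizer_fn(inputs):
    """Consume the "loss" entry and return a dictionary with a "train_op" entry."""
    return {"train_op": f"minimize({inputs['loss']})"}


features = {"x": 1.5, "y": 2.0}
predictions = pred_fn(features)
# Features and predictions are merged before being passed to the loss.
losses = loss_fn({**features, **predictions})
outputs = optimizer_fn({"loss": losses["loss"]})
```

Passing dictionaries between the steps keeps each callable independent of the others: a step only needs to agree on key names with its neighbours.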
- train_input_fn
Typically a Reader instance, but in general, any callable. Used for training.
Its signature is the following:
- outputs : tf.data.Dataset
A newly created dataset. Each call to the input_fn should create a new dataset and a new graph.
- Type:
Callable[[], tf.data.Dataset]
- eval_input_fn
Typically a Reader instance, but in general, any callable. Used for evaluation.
Its signature is the following:
- outputs : tf.data.Dataset
A newly created dataset. Each call to the input_fn should create a new dataset and a new graph.
- Type:
Callable[[], tf.data.Dataset]
- prepro_fn
Typically a Prepro instance, but in general, any callable.
Its signature is the following:
- inputs :
- dataset : tf.data.Dataset
Created by train_input_fn or eval_input_fn.
- mode : str
One of tf.estimator.ModeKeys.TRAIN, PREDICT or EVAL
- outputs : tf.data.Dataset
The preprocessed dataset
- Type:
Callable[[tf.data.Dataset, str], tf.data.Dataset], Optional
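The input and preprocessing contracts can be sketched with plain iterators standing in for tf.data.Dataset. The record contents, the mode strings "train" and "eval", and the choice to repeat the data in training mode are all assumptions made for illustration.

```python
def train_input_fn():
    """Create a brand-new iterable on every call (stand-in for a dataset)."""
    return iter([{"x": 1}, {"x": 2}, {"x": 3}])


def prepro_fn(dataset, mode):
    """Return a preprocessed dataset whose behaviour depends on the mode."""
    records = list(dataset)
    if mode == "train":
        # Illustrative mode-dependent step: go through the data twice.
        records = records * 2
    return iter(records)


train_records = list(prepro_fn(train_input_fn(), "train"))
eval_records = list(prepro_fn(train_input_fn(), "eval"))
```

Since `train_input_fn` builds a fresh object on each call, consuming one dataset never affects the next one.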
- initializer_fn
Any Callable that sets up initialization by adding an op to the default Graph.
- Type:
Callable[[], None], Optional
- train_metrics
Typically, Metric instances, but in general, any callables. Used for training.
Each callable must have the following signature:
- inputs : Dict
Features, Predictions and Losses dictionary
- outputs : Dict[str, Tuple]
Dictionary of tuples of (tensor_value, update_op).
- Type:
List[Callable], Optional
- eval_metrics
Typically, Metric instances, but in general, any callables. Used for evaluation.
Each callable must have the following signature:
- inputs : Dict
Features, Predictions and Losses dictionary
- outputs : Dict[str, Tuple]
Dictionary of tuples of (tensor_value, update_op).
- Type:
List[Callable], Optional
- exporters
Typically, Exporter instances, but in general, any callables. Used at the end of training on the trained Estimator.
Each callable must have the following signature:
- inputs : tf.estimator.Estimator
A trained Estimator.
- Type:
List[Callable], Optional
- train_hooks
List of Hooks or HookFactories.
Used for training.
Some hooks can be fully defined during instantiation of the Trainer, for example a StepsPerSecHook. However, other hooks require objects that will only be created after running the Trainer.
The hooks module defines factories for more complicated hooks.
- Type:
List, Optional
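The difference between a ready-made hook and a hook factory can be sketched as follows. All names here (`LogValueHook`, `LogValueHookFactory`, `resolve_hooks`) are hypothetical and do not come from the hooks module; the dictionary of values stands in for objects that exist only once training has started.

```python
class LogValueHook:
    """Hypothetical hook that needs a value available only at run time."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class LogValueHookFactory:
    """Hypothetical factory: stores configuration now, builds the hook later."""

    def __init__(self, name):
        self.name = name

    def __call__(self, tensors):
        return LogValueHook(self.name, tensors[self.name])


def resolve_hooks(hooks, tensors):
    """Turn factories into hooks once the run-time objects exist."""
    return [
        hook(tensors) if isinstance(hook, LogValueHookFactory) else hook
        for hook in hooks
    ]


ready_hook = LogValueHook("steps_per_sec", 12.5)
hooks = resolve_hooks([ready_hook, LogValueHookFactory("loss")], {"loss": 0.25})
```

Deferring construction this way lets a configuration list mix both kinds of entries before any graph has been built.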
- eval_hooks
List of Hooks or HookFactories.
Used for evaluation.
Some hooks can be fully defined during instantiation of the Trainer, for example a StepsPerSecHook. However, other hooks require objects that will only be created after running the Trainer.
The hooks module defines factories for more complicated hooks.
- Type:
List, Optional
- eval_spec
Optional parameters for EvalSpec.
- Type:
Dict, Optional
- train_spec
Optional parameters for TrainSpec.
- Type:
Dict, Optional
- run_config
Optional parameters for RunConfig.
- Type:
Dict, Optional
- config_proto
Optional parameters for ConfigProto.
- Type:
Dict, Optional
- create_experiment()[source]
Create an Experiment object packaging Estimator and Specs.
- Returns:
estimator : tf.estimator.Estimator
train_spec : tf.estimator.TrainSpec
eval_spec : tf.estimator.EvalSpec
- Return type:
Experiment (NamedTuple)
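A NamedTuple with these three fields could look like the sketch below. `ExperimentSketch` is a hypothetical stand-in: the fields are typed as `Any` and filled with placeholder values so that the example runs without TensorFlow.

```python
from typing import Any, NamedTuple


class ExperimentSketch(NamedTuple):
    """Bundle an estimator with its training and evaluation specs."""

    estimator: Any
    train_spec: Any
    eval_spec: Any


experiment = ExperimentSketch(
    estimator="estimator-placeholder",
    train_spec={"max_steps": 100},
    eval_spec={"throttle_secs": 100},
)
# Fields are accessible by name or by positional unpacking.
estimator, train_spec, eval_spec = experiment
```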
- initializer_fn()
- prepro_fn(_)
deepr.jobs.trainer_base module
Trainer Base.
deepr.jobs.trainer_keras module
Keras trainer.
- class deepr.jobs.trainer_keras.TrainerKeras(path_model, model, train_input_fn, eval_input_fn, prepro_fn=<function TrainerKeras.<lambda>>, exporters=<factory>, train_hooks=<factory>, eval_hooks=<factory>, final_hooks=<factory>, train_spec=<factory>, eval_spec=<factory>, final_spec=<factory>, run_config=<factory>, config_proto=<factory>, random_seed=42)[source]
Bases:
TrainerBase
Keras trainer.
- create_experiment()[source]
Create an Experiment object packaging Estimator and Specs.
- Returns:
estimator : tf.estimator.Estimator
train_spec : tf.estimator.TrainSpec
eval_spec : tf.estimator.EvalSpec
- Return type:
Experiment (NamedTuple)
- model: Model
- prepro_fn(_)
deepr.jobs.yarn_config module
Basic Yarn Config in charge of uploading environment.
- class deepr.jobs.yarn_config.YarnConfig(name, gpu_additional_packages=('tensorflow-gpu==1.15.2', 'tf-yarn-gpu==0.4.19'), gpu_ignored_packages=('tensorflow', 'tf-yarn'), cpu_ignored_packages=(), gpu_to_use=None, jvm_memory_in_gb=8, path_pex_cpu=None, path_pex_gpu=None, path_pex_prefix='viewfs://root/user/runner/envs')[source]
Bases:
object
Basic Yarn Config in charge of uploading environment.
deepr.jobs.yarn_launcher module
Yarn Launcher Config Interface and Job
- class deepr.jobs.yarn_launcher.YarnLauncher(job, config, run_on_yarn=True, use_mlflow=False)[source]
Bases:
Job
Packages the current environment, uploads the .pex and runs the yarn job.
- config: YarnLauncherConfig
- class deepr.jobs.yarn_launcher.YarnLauncherConfig(name='yarn-launcher-2023-03-07-14-05-28', gpu_additional_packages=('tensorflow-gpu==1.15.2', 'tf-yarn-gpu==0.4.19'), gpu_ignored_packages=('tensorflow', 'tf-yarn'), cpu_ignored_packages=(), gpu_to_use=None, jvm_memory_in_gb=8, path_pex_cpu=None, path_pex_gpu=None, path_pex_prefix='viewfs://root/user/runner/envs', hadoop_file_systems=(), memory='48 GiB', num_cores=48)[source]
Bases:
YarnConfig
Yarn Launcher Config.
deepr.jobs.yarn_trainer module
Yarn Trainer Config and Job
- class deepr.jobs.yarn_trainer.YarnTrainer(trainer, config, train_on_yarn=True)[source]
Bases:
Job
Run a TrainerBase job on yarn.
- config: YarnTrainerConfig
- class deepr.jobs.yarn_trainer.YarnTrainerConfig(name='yarn-trainer-2023-03-07-14-05-28', gpu_additional_packages=('tensorflow-gpu==1.15.2', 'tf-yarn-gpu==0.4.19'), gpu_ignored_packages=('tensorflow', 'tf-yarn'), cpu_ignored_packages=(), gpu_to_use=None, jvm_memory_in_gb=8, path_pex_cpu=None, path_pex_gpu=None, path_pex_prefix='viewfs://root/user/runner/envs', nb_ps=None, nb_retries=0, nb_workers=None, pre_script_hook='source /etc/profile.d/cuda.sh && setupCUDA 10.1 && setupCUDNN cuda10.1_v7.6.4.38', queue='dev', tf_yarn='cpu', tf_yarn_chief_cores=48, tf_yarn_chief_memory='48 GiB', tf_yarn_evaluator_cores=48, tf_yarn_evaluator_memory='48 GiB', tf_yarn_tensorboard_memory='48 GiB')[source]
Bases:
YarnConfig
Default Yarn Trainer Config