Runners¶

Runner
SupervisedRunner

class catalyst.runners.Runner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, experiment_fn: Callable = <class 'catalyst.experiments.experiment.Experiment'>)[source]¶

Bases: catalyst.core.runner.IStageBasedRunner

Deep Learning Runner for supervised, unsupervised, gan, etc runs.

__init__(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, experiment_fn: Callable = <class 'catalyst.experiments.experiment.Experiment'>)[source]¶

Parameters

model – Torch model object
device – Torch device
experiment_fn – callable function, which defines default experiment type to use during .train and .infer methods.

infer(*, model: torch.nn.modules.module.Module, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, verbose: bool = False, stage_kwargs: Dict = None, fp16: Union[Dict, bool] = None, check: bool = False, timeit: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]¶

Starts the inference stage of the model.

Parameters

model – model for inference
datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset for training, validation or inference, used for automatic creation of loaders; the preferred way for a distributed training setup
loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference
callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks
logdir – path to output directory
resume – path to checkpoint to use for resume
verbose – if True, it displays the status of the training to the console.
stage_kwargs – additional stage params
fp16 (Union[Dict, bool]) – if not None, enables FP16 mode. See https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the params default to {"opt_level": "O1"}
check – if True, then only checks that pipeline is working (3 epochs only)
timeit – if True, computes the execution time of training process and displays it to the console.
initial_seed – experiment’s initial seed value
state_kwargs – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist
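The fp16 argument accepts either a bool or a dict. Below is a minimal sketch of how such an argument could be normalized, assuming that True expands to the documented default {"opt_level": "O1"} and that None or False disables mixed precision. normalize_fp16 is a hypothetical helper used for illustration only, not a Catalyst function.

```python
from typing import Dict, Optional, Union


def normalize_fp16(fp16: Union[Dict, bool, None]) -> Optional[Dict]:
    """Hypothetical helper: turn the ``fp16`` argument into Apex AMP params."""
    if fp16 is None or fp16 is False:
        # Assumption: both values mean "no mixed precision".
        return None
    if fp16 is True:
        # Documented default when fp16=True.
        return {"opt_level": "O1"}
    # A dict is passed through as-is (copied so the caller's dict is not mutated).
    return dict(fp16)
```

Under these assumptions, normalize_fp16(True) and normalize_fp16({"opt_level": "O1"}) describe the same configuration.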

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]¶

Run model inference on specified data batch.

Parameters

batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.
**kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]

Raises

NotImplementedError – if not implemented yet

predict_loader(*, loader: torch.utils.data.dataloader.DataLoader, model: torch.nn.modules.module.Module = None, resume: str = None, fp16: Union[Dict, bool] = None, initial_seed: int = 42) → Generator[source]¶

Runs model inference on a PyTorch DataLoader and returns a Python generator with model predictions from runner.predict_batch. Cleans up the experiment info to avoid possible collisions. Sets is_train_loader and is_valid_loader to False while keeping is_infer_loader as True. Moves the model to evaluation mode.

Parameters

loader – loader to predict
model – model to use for prediction
resume – path to checkpoint to resume
fp16 (Union[Dict, bool]) – fp16 usage flag
initial_seed – seed to use before prediction

Yields

batches with model predictions
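The generator behaviour described above can be sketched in plain Python. This is an illustration only: ToyRunner and its doubling "model" are hypothetical stand-ins, and the real method also handles checkpoints, devices and evaluation mode, which are omitted here.

```python
from typing import Any, Iterable, Iterator, Mapping


class ToyRunner:
    """Hypothetical stand-in for a runner; not the Catalyst implementation."""

    def predict_batch(self, batch: Mapping[str, Any]) -> Mapping[str, Any]:
        # Stand-in "model": double every feature value.
        return {"logits": [2 * x for x in batch["features"]]}

    def predict_loader(
        self, loader: Iterable[Mapping[str, Any]]
    ) -> Iterator[Mapping[str, Any]]:
        # Lazily yield one prediction dictionary per batch.
        for batch in loader:
            yield self.predict_batch(batch)


loader = [{"features": [1, 2]}, {"features": [3]}]  # any iterable of batches
predictions = list(ToyRunner().predict_loader(loader))
```

Because the result is a generator, nothing is computed until it is iterated; wrapping it in list() as above materializes all predictions at once.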

trace(*, model: torch.nn.modules.module.Module = None, batch: Any = None, logdir: str = None, loader: torch.utils.data.dataloader.DataLoader = None, method_name: str = 'forward', mode: str = 'eval', requires_grad: bool = False, fp16: Union[Dict, bool] = None, device: Union[str, torch.device] = 'cpu', predict_params: dict = None) → torch.jit._script.ScriptModule[source]¶

Traces model using Torch Jit.

Parameters

model – model to trace
batch – batch to forward through the model to trace
logdir (str, optional) – If specified, the result will be written to the directory
loader (DataLoader, optional) – if batch is not specified, the batch will be next(iter(loader))
method_name – model’s method name that will be traced
mode – train or eval
requires_grad – flag to trace with gradients
fp16 (Union[Dict, bool]) – If not None, then sets tracing params to FP16
device – Torch device or a string
predict_params – additional parameters for model forward

Returns

traced model

Return type

ScriptModule

Raises

ValueError – if both batch and loader are None
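The interplay between batch and loader documented above can be sketched as follows. resolve_trace_batch is a hypothetical helper that mirrors only the documented fallback and error; it is not part of Catalyst, and the error message is made up.

```python
from typing import Any, Iterable, Optional


def resolve_trace_batch(
    batch: Optional[Any] = None, loader: Optional[Iterable] = None
) -> Any:
    """Hypothetical helper mirroring the documented batch/loader fallback."""
    if batch is not None:
        return batch
    if loader is not None:
        # Documented fallback: take the first batch from the loader.
        return next(iter(loader))
    raise ValueError("Either batch or loader must be specified.")
```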

train(*, model: torch.nn.modules.module.Module, criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, num_epochs: int = 1, valid_loader: str = 'valid', main_metric: str = 'loss', minimize_metric: bool = True, verbose: bool = False, stage_kwargs: Dict = None, checkpoint_data: Dict = None, fp16: Union[Dict, bool] = None, distributed: bool = False, check: bool = False, overfit: bool = False, timeit: bool = False, load_best_on_end: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]¶

Starts the train stage of the model.

Parameters

model – model to train
criterion – criterion function for training
optimizer – optimizer for training
scheduler – scheduler for training
datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset for training, validation or inference, used for automatic creation of loaders; the preferred way for a distributed training setup
loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference
callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks
logdir – path to output directory
resume – path to checkpoint for model
num_epochs – number of training epochs
valid_loader – loader name used to calculate the metrics and save the checkpoints. For example, you can pass train and then the metrics will be taken from train loader.
main_metric – the key to the name of the metric by which the checkpoints will be selected.
minimize_metric – flag to indicate whether the main_metric should be minimized.
verbose – if True, it displays the status of the training to the console.
stage_kwargs – additional params for stage
checkpoint_data – additional data to save in checkpoint, for example: class_names, date_of_training, etc
fp16 (Union[Dict, bool]) – if not None, enables FP16 training. See https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the params default to {"opt_level": "O1"}
distributed – if True, starts training in distributed mode. Note: works only with Python scripts; Jupyter is not supported.
check – if True, then only checks that pipeline is working (3 epochs only with 3 batches per loader)
overfit – if True, then takes only one batch per loader for model overfitting; for advanced usage, please check BatchOverfitCallback
timeit – if True, computes the execution time of training process and displays it to the console.
load_best_on_end – if True, Runner will load best checkpoint state (model, optimizer, etc) according to validation metrics. Requires specified logdir.
initial_seed – experiment’s initial seed value
state_kwargs – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist
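The roles of main_metric and minimize_metric in checkpoint selection can be illustrated with a small sketch. best_epoch is a hypothetical helper and the metric values are made up; Catalyst's actual checkpointing logic lives in its callbacks and is not reproduced here.

```python
from typing import Dict, List


def best_epoch(
    history: List[Dict[str, float]],
    main_metric: str = "loss",
    minimize_metric: bool = True,
) -> int:
    """Hypothetical helper: index of the epoch with the best ``main_metric``."""
    values = [epoch_metrics[main_metric] for epoch_metrics in history]
    pick = min if minimize_metric else max
    return values.index(pick(values))


# Made-up validation metrics for three epochs.
history = [
    {"loss": 0.9, "accuracy": 0.60},
    {"loss": 0.5, "accuracy": 0.80},
    {"loss": 0.6, "accuracy": 0.85},
]
```

With the defaults, the epoch with the lowest loss wins; passing main_metric="accuracy" and minimize_metric=False selects the epoch with the highest accuracy instead.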

class catalyst.runners.SupervisedRunner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets', experiment_fn: Callable = <class 'catalyst.experiments.supervised.SupervisedExperiment'>)[source]¶

Bases: catalyst.runners.runner.Runner

Runner for experiments with supervised model.

__init__(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets', experiment_fn: Callable = <class 'catalyst.experiments.supervised.SupervisedExperiment'>)[source]¶

Parameters

model – Torch model object
device – Torch device
input_key – Key in batch dict mapping for model input
output_key – Key under which the model output is stored in the output dict
input_target_key – Key in batch dict mapping for target
experiment_fn – callable function, which defines default experiment type to use during .train and .infer methods.

forward(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]¶

Forward method for your Runner. Should not be called directly outside of the runner. If your model has a specific interface, override this method to use it.

Parameters

batch (Mapping[str, Any]) – dictionary with data batches from DataLoaders.
**kwargs – additional parameters to pass to the model

Returns

dict with model output batch
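A rough sketch of the key-based routing that input_key and output_key imply, assuming a single string input key and a model that is a plain callable. The forward function and toy_model below are hypothetical illustrations, not the Catalyst source.

```python
from typing import Any, Callable, Mapping


def forward(
    model: Callable,
    batch: Mapping[str, Any],
    input_key: str = "features",
    output_key: str = "logits",
    **kwargs,
) -> Mapping[str, Any]:
    """Hypothetical sketch of a supervised forward pass with key-based routing."""
    # Pick the model input out of the batch dict and call the model on it.
    output = model(batch[input_key], **kwargs)
    # Store the model output under the configured output key.
    return {output_key: output}


def toy_model(xs):
    # Stand-in "model": add one to every input value.
    return [x + 1 for x in xs]


result = forward(toy_model, {"features": [1, 2], "targets": [0, 1]})
```

The targets entry is ignored by the forward pass in this sketch; only the value stored under input_key reaches the model.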

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]¶

Run model inference on specified data batch.

Warning

You should not override this method. If you need a specific model call, override the forward() method instead.

Parameters

batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.
**kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]

Runner ¶

class catalyst.runners.runner.Runner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, experiment_fn: Callable = <class 'catalyst.experiments.experiment.Experiment'>)[source]¶

Bases: catalyst.core.runner.IStageBasedRunner

Deep Learning Runner for supervised, unsupervised, gan, etc runs.

__init__(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, experiment_fn: Callable = <class 'catalyst.experiments.experiment.Experiment'>)[source]¶

Parameters

model – Torch model object
device – Torch device
experiment_fn – callable function, which defines default experiment type to use during .train and .infer methods.

infer(*, model: torch.nn.modules.module.Module, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, verbose: bool = False, stage_kwargs: Dict = None, fp16: Union[Dict, bool] = None, check: bool = False, timeit: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]¶

Starts the inference stage of the model.

Parameters

model – model for inference
datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset for training, validation or inference, used for automatic creation of loaders; the preferred way for a distributed training setup
loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference
callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks
logdir – path to output directory
resume – path to checkpoint to use for resume
verbose – if True, it displays the status of the training to the console.
stage_kwargs – additional stage params
fp16 (Union[Dict, bool]) – if not None, enables FP16 mode. See https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the params default to {"opt_level": "O1"}
check – if True, then only checks that pipeline is working (3 epochs only)
timeit – if True, computes the execution time of training process and displays it to the console.
initial_seed – experiment’s initial seed value
state_kwargs – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]¶

Run model inference on specified data batch.

Parameters

batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.
**kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]

Raises

NotImplementedError – if not implemented yet

predict_loader(*, loader: torch.utils.data.dataloader.DataLoader, model: torch.nn.modules.module.Module = None, resume: str = None, fp16: Union[Dict, bool] = None, initial_seed: int = 42) → Generator[source]¶

Runs model inference on a PyTorch DataLoader and returns a Python generator with model predictions from runner.predict_batch. Cleans up the experiment info to avoid possible collisions. Sets is_train_loader and is_valid_loader to False while keeping is_infer_loader as True. Moves the model to evaluation mode.

Parameters

loader – loader to predict
model – model to use for prediction
resume – path to checkpoint to resume
fp16 (Union[Dict, bool]) – fp16 usage flag
initial_seed – seed to use before prediction

Yields

batches with model predictions

trace(*, model: torch.nn.modules.module.Module = None, batch: Any = None, logdir: str = None, loader: torch.utils.data.dataloader.DataLoader = None, method_name: str = 'forward', mode: str = 'eval', requires_grad: bool = False, fp16: Union[Dict, bool] = None, device: Union[str, torch.device] = 'cpu', predict_params: dict = None) → torch.jit._script.ScriptModule[source]¶

Traces model using Torch Jit.

Parameters

model – model to trace
batch – batch to forward through the model to trace
logdir (str, optional) – If specified, the result will be written to the directory
loader (DataLoader, optional) – if batch is not specified, the batch will be next(iter(loader))
method_name – model’s method name that will be traced
mode – train or eval
requires_grad – flag to trace with gradients
fp16 (Union[Dict, bool]) – If not None, then sets tracing params to FP16
device – Torch device or a string
predict_params – additional parameters for model forward

Returns

traced model

Return type

ScriptModule

Raises

ValueError – if both batch and loader are None

train(*, model: torch.nn.modules.module.Module, criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, num_epochs: int = 1, valid_loader: str = 'valid', main_metric: str = 'loss', minimize_metric: bool = True, verbose: bool = False, stage_kwargs: Dict = None, checkpoint_data: Dict = None, fp16: Union[Dict, bool] = None, distributed: bool = False, check: bool = False, overfit: bool = False, timeit: bool = False, load_best_on_end: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]¶

Starts the train stage of the model.

Parameters

model – model to train
criterion – criterion function for training
optimizer – optimizer for training
scheduler – scheduler for training
datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset for training, validation or inference, used for automatic creation of loaders; the preferred way for a distributed training setup
loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference
callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks
logdir – path to output directory
resume – path to checkpoint for model
num_epochs – number of training epochs
valid_loader – loader name used to calculate the metrics and save the checkpoints. For example, you can pass train and then the metrics will be taken from train loader.
main_metric – the key to the name of the metric by which the checkpoints will be selected.
minimize_metric – flag to indicate whether the main_metric should be minimized.
verbose – if True, it displays the status of the training to the console.
stage_kwargs – additional params for stage
checkpoint_data – additional data to save in checkpoint, for example: class_names, date_of_training, etc
fp16 (Union[Dict, bool]) – if not None, enables FP16 training. See https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the params default to {"opt_level": "O1"}
distributed – if True, starts training in distributed mode. Note: works only with Python scripts; Jupyter is not supported.
check – if True, then only checks that pipeline is working (3 epochs only with 3 batches per loader)
overfit – if True, then takes only one batch per loader for model overfitting; for advanced usage, please check BatchOverfitCallback
timeit – if True, computes the execution time of training process and displays it to the console.
load_best_on_end – if True, Runner will load best checkpoint state (model, optimizer, etc) according to validation metrics. Requires specified logdir.
initial_seed – experiment’s initial seed value
state_kwargs – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist

SupervisedRunner ¶

class catalyst.runners.supervised.SupervisedRunner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets', experiment_fn: Callable = <class 'catalyst.experiments.supervised.SupervisedExperiment'>)[source]¶

Bases: catalyst.runners.runner.Runner

Runner for experiments with supervised model.

__init__(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets', experiment_fn: Callable = <class 'catalyst.experiments.supervised.SupervisedExperiment'>)[source]¶

Parameters

model – Torch model object
device – Torch device
input_key – Key in batch dict mapping for model input
output_key – Key under which the model output is stored in the output dict
input_target_key – Key in batch dict mapping for target
experiment_fn – callable function, which defines default experiment type to use during .train and .infer methods.

forward(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]¶

Forward method for your Runner. Should not be called directly outside of the runner. If your model has a specific interface, override this method to use it.

Parameters

batch (Mapping[str, Any]) – dictionary with data batches from DataLoaders.
**kwargs – additional parameters to pass to the model

Returns

dict with model output batch

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]¶

Run model inference on specified data batch.

Warning

You should not override this method. If you need a specific model call, override the forward() method instead.

Parameters

batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.
**kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]
