Shortcuts

Runners

class catalyst.runners.Runner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, **kwargs)[source]

Bases: catalyst.core.runner.IStageBasedRunner

Deep Learning Runner for supervised, unsupervised, gan, etc runs.

infer(*, model: torch.nn.modules.module.Module, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, verbose: bool = False, stage_kwargs: Dict = None, fp16: Union[Dict, bool] = None, check: bool = False, timeit: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]

Starts the inference stage of the model.

Parameters
  • model – model for inference

  • datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset objects for training, validation or inference, used to create the loaders automatically. This is the preferred way for a distributed training setup.

  • loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference

  • callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks

  • logdir – path to output directory

  • resume – path to checkpoint to use for resume

  • verbose – if True, it displays the status of the training to the console.

  • stage_kwargs – additional stage params

  • fp16 (Union[Dict, bool]) – If not None, then sets training to FP16. See https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the params default to {"opt_level": "O1"}.

  • check – if True, then only checks that pipeline is working (3 epochs only)

  • timeit – if True, computes the execution time of training process and displays it to the console.

  • initial_seed – experiment’s initial seed value

  • state_kwargs – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist
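The fp16 parameter accepts either a boolean or a dictionary. The following is a minimal sketch of how such a value could be normalised, based only on the description above (fp16=True maps to {"opt_level": "O1"}). The helper name resolve_fp16_params is hypothetical and not part of Catalyst, and treating False the same as None is an assumption.

```python
from typing import Dict, Optional, Union


def resolve_fp16_params(fp16: Union[Dict, bool, None]) -> Optional[Dict]:
    """Hypothetical helper: turn the ``fp16`` argument into a params dict."""
    if fp16 is None or fp16 is False:
        # Assumption: False disables FP16 in the same way as None.
        return None
    if fp16 is True:
        # Documented default when only the flag is given.
        return {"opt_level": "O1"}
    if isinstance(fp16, dict):
        # Copy the user-provided params so the caller's dict is not mutated.
        return dict(fp16)
    raise TypeError("fp16 must be a dict, a bool or None")
```

With this sketch, resolve_fp16_params({"opt_level": "O2"}) passes the custom settings through unchanged, while resolve_fp16_params(True) falls back to the documented default.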

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Run model inference on specified data batch.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.

  • **kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]

Raises

NotImplementedError – if not implemented yet

predict_loader(*, loader: torch.utils.data.dataloader.DataLoader, model: torch.nn.modules.module.Module = None, resume: str = None, fp16: Union[Dict, bool] = None, initial_seed: int = 42) → Generator[source]

Runs model inference on PyTorch Dataloader and returns python generator with model predictions from runner.predict_batch. Cleans up the experiment info to avoid possible collisions. Sets is_train_loader and is_valid_loader to False while keeping is_infer_loader as True. Moves model to evaluation mode.

Parameters
  • loader – loader to predict

  • model – model to use for prediction

  • resume – path to checkpoint to resume

  • fp16 (Union[Dict, bool]) – fp16 usage flag

  • initial_seed – seed to use before prediction

Yields

batches with model predictions
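Because predict_loader returns a generator, predictions are produced one batch at a time as the caller iterates. The pure-Python sketch below illustrates only that lazy pattern; it is not Catalyst's implementation, and predict_loader_sketch, toy_loader and toy_predict_batch are hypothetical stand-ins for the runner method, a DataLoader and runner.predict_batch.

```python
from typing import Any, Callable, Generator, Iterable, Mapping


def predict_loader_sketch(
    loader: Iterable[Mapping[str, Any]],
    predict_batch: Callable[[Mapping[str, Any]], Mapping[str, Any]],
) -> Generator[Mapping[str, Any], None, None]:
    """Yield one prediction mapping per batch, computed lazily."""
    for batch in loader:
        yield predict_batch(batch)


def toy_predict_batch(batch: Mapping[str, Any]) -> Mapping[str, Any]:
    """Toy "model": doubles every input value."""
    return {"logits": [2 * x for x in batch["features"]]}


# A plain list of dict batches stands in for a DataLoader.
toy_loader = [{"features": [1, 2]}, {"features": [3]}]

# Nothing is computed until the generator is consumed.
predictions = list(predict_loader_sketch(toy_loader, toy_predict_batch))
```

Collecting the generator into a list, as done here, is convenient for small loaders; for large ones, iterating with a for loop avoids holding every prediction in memory.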

trace(*, model: torch.nn.modules.module.Module = None, batch: Any = None, logdir: str = None, loader: torch.utils.data.dataloader.DataLoader = None, method_name: str = 'forward', mode: str = 'eval', requires_grad: bool = False, fp16: Union[Dict, bool] = None, device: Union[str, torch.device] = 'cpu', predict_params: dict = None) → torch.jit.ScriptModule[source]

Traces model using Torch Jit.

Parameters
  • model – model to trace

  • batch – batch to forward through the model to trace

  • logdir (str, optional) – If specified, the result will be written to the directory

  • loader (DataLoader, optional) – if batch is not specified, the batch will be next(iter(loader))

  • method_name – model’s method name that will be traced

  • mode – train or eval

  • requires_grad – flag to trace with gradients

  • fp16 (Union[Dict, bool]) – If not None, then sets tracing params to FP16

  • device – Torch device or a string

  • predict_params – additional parameters for model forward

Returns

traced model

Return type

ScriptModule

Raises

ValueError – if both batch and loader are None
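The relationship between batch and loader described above can be sketched as follows. This is an illustration of the documented rule only (use batch if given, otherwise next(iter(loader)), otherwise raise ValueError); the function name select_trace_batch is hypothetical and does not exist in Catalyst.

```python
from typing import Any, Iterable, Optional


def select_trace_batch(
    batch: Optional[Any] = None,
    loader: Optional[Iterable] = None,
) -> Any:
    """Hypothetical helper: pick the example batch used for tracing."""
    if batch is not None:
        # An explicitly passed batch takes precedence.
        return batch
    if loader is not None:
        # Fall back to the first batch produced by the loader.
        return next(iter(loader))
    raise ValueError("Either batch or loader must be specified")
```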

train(*, model: torch.nn.modules.module.Module, criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, num_epochs: int = 1, valid_loader: str = 'valid', main_metric: str = 'loss', minimize_metric: bool = True, verbose: bool = False, stage_kwargs: Dict = None, checkpoint_data: Dict = None, fp16: Union[Dict, bool] = None, distributed: bool = False, check: bool = False, overfit: bool = False, timeit: bool = False, load_best_on_end: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]

Starts the train stage of the model.

Parameters
  • model – model to train

  • criterion – criterion function for training

  • optimizer – optimizer for training

  • scheduler – scheduler for training

  • datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset objects for training, validation or inference, used to create the loaders automatically. This is the preferred way for a distributed training setup.

  • loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference

  • callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks

  • logdir – path to output directory

  • resume – path to checkpoint for model

  • num_epochs – number of training epochs

  • valid_loader – loader name used to calculate the metrics and save the checkpoints. For example, you can pass train and then the metrics will be taken from train loader.

  • main_metric – the key to the name of the metric by which the checkpoints will be selected.

  • minimize_metric – flag to indicate whether the main_metric should be minimized.

  • verbose – if True, it displays the status of the training to the console.

  • stage_kwargs – additional params for stage

  • checkpoint_data – additional data to save in checkpoint, for example: class_names, date_of_training, etc

  • fp16 (Union[Dict, bool]) – If not None, then sets training to FP16. See https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the params default to {"opt_level": "O1"}.

  • distributed – if True will start training in distributed mode. Note: Works only with python scripts. No jupyter support.

  • check – if True, then only checks that pipeline is working (3 epochs only with 3 batches per loader)

  • overfit – if True, then takes only one batch per loader for model overfitting. For advanced usage, please check BatchOverfitCallback.

  • timeit – if True, computes the execution time of training process and displays it to the console.

  • load_best_on_end – if True, Runner will load best checkpoint state (model, optimizer, etc) according to validation metrics. Requires specified logdir.

  • initial_seed – experiment’s initial seed value

  • state_kwargs – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist
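The valid_loader, main_metric and minimize_metric parameters together determine which checkpoint counts as the best one. The sketch below shows that selection logic on made-up metric values; best_epoch and the nested-dict layout of history are assumptions for illustration, not Catalyst's internal representation.

```python
from typing import Dict, List


def best_epoch(
    epoch_metrics: List[Dict[str, Dict[str, float]]],
    valid_loader: str = "valid",
    main_metric: str = "loss",
    minimize_metric: bool = True,
) -> int:
    """Return the index of the epoch with the best main metric."""
    # Read the chosen metric from the chosen loader for every epoch.
    scores = [metrics[valid_loader][main_metric] for metrics in epoch_metrics]
    # Lower is better when minimizing, higher is better otherwise.
    pick = min if minimize_metric else max
    return scores.index(pick(scores))


# Illustrative values only: one entry per epoch, keyed by loader name.
history = [
    {"train": {"loss": 0.9, "accuracy": 0.60}, "valid": {"loss": 0.8, "accuracy": 0.65}},
    {"train": {"loss": 0.5, "accuracy": 0.80}, "valid": {"loss": 0.6, "accuracy": 0.75}},
    {"train": {"loss": 0.3, "accuracy": 0.90}, "valid": {"loss": 0.7, "accuracy": 0.78}},
]

by_valid_loss = best_epoch(history)
by_valid_accuracy = best_epoch(history, main_metric="accuracy", minimize_metric=False)
by_train_loss = best_epoch(history, valid_loader="train")
```

Note that a metric such as accuracy needs minimize_metric=False; leaving the default would select the worst epoch.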

class catalyst.runners.SupervisedRunner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets')[source]

Bases: catalyst.runners.runner.Runner

Runner for experiments with supervised model.

__init__(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets')[source]
Parameters
  • model – Torch model object

  • device – Torch device

  • input_key – Key in batch dict mapping for model input

  • output_key – Key in output dict model output will be stored under

  • input_target_key – Key in batch dict mapping for target

forward(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Forward method for your Runner. Should not be called directly outside of the runner. If your model has a specific interface, override this method to use it.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoaders.

  • **kwargs – additional parameters to pass to the model

Returns

dict with model output batch
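The roles of input_key and output_key can be illustrated with a small sketch: the model input is read from the batch under input_key, and the model output is stored under output_key. This assumes a single string key for each and is not the library's actual forward implementation; forward_sketch and toy_model are hypothetical names.

```python
from typing import Any, Callable, Dict, Mapping


def forward_sketch(
    model: Callable[..., Any],
    batch: Mapping[str, Any],
    input_key: str = "features",
    output_key: str = "logits",
    **kwargs: Any,
) -> Dict[str, Any]:
    """Call the model on batch[input_key] and wrap the result in a dict."""
    output = model(batch[input_key], **kwargs)
    return {output_key: output}


def toy_model(features, scale=1):
    """Toy "model": multiplies every feature by ``scale``."""
    return [scale * x for x in features]


batch = {"features": [1, 2], "targets": [0, 1]}
default_output = forward_sketch(toy_model, batch)
scaled_output = forward_sketch(toy_model, batch, output_key="scores", scale=3)
```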

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Run model inference on specified data batch.

Warning

You should not override this method. If you need a specific model call, override the forward() method.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.

  • **kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]

Runner

class catalyst.runners.runner.Runner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, **kwargs)[source]

Bases: catalyst.core.runner.IStageBasedRunner

Deep Learning Runner for supervised, unsupervised, gan, etc runs.

infer(*, model: torch.nn.modules.module.Module, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, verbose: bool = False, stage_kwargs: Dict = None, fp16: Union[Dict, bool] = None, check: bool = False, timeit: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]

Starts the inference stage of the model.

Parameters
  • model – model for inference

  • datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset objects for training, validation or inference, used to create the loaders automatically. This is the preferred way for a distributed training setup.

  • loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference

  • callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks

  • logdir – path to output directory

  • resume – path to checkpoint to use for resume

  • verbose – if True, it displays the status of the training to the console.

  • stage_kwargs – additional stage params

  • fp16 (Union[Dict, bool]) – If not None, then sets training to FP16. See https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the params default to {"opt_level": "O1"}.

  • check – if True, then only checks that pipeline is working (3 epochs only)

  • timeit – if True, computes the execution time of training process and displays it to the console.

  • initial_seed – experiment’s initial seed value

  • state_kwargs – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Run model inference on specified data batch.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.

  • **kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]

Raises

NotImplementedError – if not implemented yet

predict_loader(*, loader: torch.utils.data.dataloader.DataLoader, model: torch.nn.modules.module.Module = None, resume: str = None, fp16: Union[Dict, bool] = None, initial_seed: int = 42) → Generator[source]

Runs model inference on PyTorch Dataloader and returns python generator with model predictions from runner.predict_batch. Cleans up the experiment info to avoid possible collisions. Sets is_train_loader and is_valid_loader to False while keeping is_infer_loader as True. Moves model to evaluation mode.

Parameters
  • loader – loader to predict

  • model – model to use for prediction

  • resume – path to checkpoint to resume

  • fp16 (Union[Dict, bool]) – fp16 usage flag

  • initial_seed – seed to use before prediction

Yields

batches with model predictions

trace(*, model: torch.nn.modules.module.Module = None, batch: Any = None, logdir: str = None, loader: torch.utils.data.dataloader.DataLoader = None, method_name: str = 'forward', mode: str = 'eval', requires_grad: bool = False, fp16: Union[Dict, bool] = None, device: Union[str, torch.device] = 'cpu', predict_params: dict = None) → torch.jit.ScriptModule[source]

Traces model using Torch Jit.

Parameters
  • model – model to trace

  • batch – batch to forward through the model to trace

  • logdir (str, optional) – If specified, the result will be written to the directory

  • loader (DataLoader, optional) – if batch is not specified, the batch will be next(iter(loader))

  • method_name – model’s method name that will be traced

  • mode – train or eval

  • requires_grad – flag to trace with gradients

  • fp16 (Union[Dict, bool]) – If not None, then sets tracing params to FP16

  • device – Torch device or a string

  • predict_params – additional parameters for model forward

Returns

traced model

Return type

ScriptModule

Raises

ValueError – if both batch and loader are None

train(*, model: torch.nn.modules.module.Module, criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, num_epochs: int = 1, valid_loader: str = 'valid', main_metric: str = 'loss', minimize_metric: bool = True, verbose: bool = False, stage_kwargs: Dict = None, checkpoint_data: Dict = None, fp16: Union[Dict, bool] = None, distributed: bool = False, check: bool = False, overfit: bool = False, timeit: bool = False, load_best_on_end: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]

Starts the train stage of the model.

Parameters
  • model – model to train

  • criterion – criterion function for training

  • optimizer – optimizer for training

  • scheduler – scheduler for training

  • datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset objects for training, validation or inference, used to create the loaders automatically. This is the preferred way for a distributed training setup.

  • loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference

  • callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks

  • logdir – path to output directory

  • resume – path to checkpoint for model

  • num_epochs – number of training epochs

  • valid_loader – loader name used to calculate the metrics and save the checkpoints. For example, you can pass train and then the metrics will be taken from train loader.

  • main_metric – the key to the name of the metric by which the checkpoints will be selected.

  • minimize_metric – flag to indicate whether the main_metric should be minimized.

  • verbose – if True, it displays the status of the training to the console.

  • stage_kwargs – additional params for stage

  • checkpoint_data – additional data to save in checkpoint, for example: class_names, date_of_training, etc

  • fp16 (Union[Dict, bool]) – If not None, then sets training to FP16. See https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the params default to {"opt_level": "O1"}.

  • distributed – if True will start training in distributed mode. Note: Works only with python scripts. No jupyter support.

  • check – if True, then only checks that pipeline is working (3 epochs only with 3 batches per loader)

  • overfit – if True, then takes only one batch per loader for model overfitting. For advanced usage, please check BatchOverfitCallback.

  • timeit – if True, computes the execution time of training process and displays it to the console.

  • load_best_on_end – if True, Runner will load best checkpoint state (model, optimizer, etc) according to validation metrics. Requires specified logdir.

  • initial_seed – experiment’s initial seed value

  • state_kwargs – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist
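The check and overfit flags both shorten what each loader yields: three batches per loader for check, one batch per loader for overfit. A rough sketch of that truncation using itertools.islice is shown below. The helper limit_batches is hypothetical, and giving overfit precedence when both flags are set is an assumption, since the description above does not say how the two combine.

```python
from itertools import islice
from typing import Any, Iterable, Iterator


def limit_batches(
    loader: Iterable[Any],
    check: bool = False,
    overfit: bool = False,
) -> Iterator[Any]:
    """Hypothetical helper: truncate a loader according to the flags."""
    if overfit:
        # Assumption: overfit wins if both flags are set.
        return islice(loader, 1)
    if check:
        # Pipeline check: only the first 3 batches of the loader.
        return islice(loader, 3)
    # No flag set: iterate over the full loader.
    return iter(loader)


# A range stands in for a loader that yields 10 batches.
check_batches = list(limit_batches(range(10), check=True))
overfit_batches = list(limit_batches(range(10), overfit=True))
all_batches = list(limit_batches(range(10)))
```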

SupervisedRunner

class catalyst.runners.supervised.SupervisedRunner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets')[source]

Bases: catalyst.runners.runner.Runner

Runner for experiments with supervised model.

__init__(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets')[source]
Parameters
  • model – Torch model object

  • device – Torch device

  • input_key – Key in batch dict mapping for model input

  • output_key – Key in output dict model output will be stored under

  • input_target_key – Key in batch dict mapping for target

forward(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Forward method for your Runner. Should not be called directly outside of the runner. If your model has a specific interface, override this method to use it.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoaders.

  • **kwargs – additional parameters to pass to the model

Returns

dict with model output batch

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Run model inference on specified data batch.

Warning

You should not override this method. If you need a specific model call, override the forward() method.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.

  • **kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]
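The warning on predict_batch suggests a template-method arrangement: predict_batch stays fixed and delegates to forward, which is the intended extension point. The sketch below illustrates that pattern in plain Python, assuming predict_batch simply calls forward. The classes BaseRunnerSketch and TwoInputRunnerSketch, and the "left" and "right" batch keys, are hypothetical and do not correspond to Catalyst classes.

```python
from typing import Any, Callable, Dict, Mapping


class BaseRunnerSketch:
    """Hypothetical stand-in for a runner with a fixed predict_batch."""

    def __init__(self, model: Callable[..., Any]) -> None:
        self.model = model

    def forward(self, batch: Mapping[str, Any], **kwargs: Any) -> Dict[str, Any]:
        # Default interface: a single input stored under "features".
        return {"logits": self.model(batch["features"], **kwargs)}

    def predict_batch(self, batch: Mapping[str, Any], **kwargs: Any) -> Dict[str, Any]:
        # Fixed entry point: always delegates to forward.
        return self.forward(batch, **kwargs)


class TwoInputRunnerSketch(BaseRunnerSketch):
    """Overrides forward only, for a model that takes two inputs."""

    def forward(self, batch: Mapping[str, Any], **kwargs: Any) -> Dict[str, Any]:
        return {"logits": self.model(batch["left"], batch["right"], **kwargs)}


def add_inputs(left, right):
    """Toy two-input "model": element-wise sum."""
    return [a + b for a, b in zip(left, right)]


runner = TwoInputRunnerSketch(add_inputs)
result = runner.predict_batch({"left": [1, 2], "right": [10, 20]})
```

Because only forward is overridden, any logic that predict_batch wraps around the model call keeps working for the custom interface.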