DL

Experiment

Experiment

class catalyst.dl.experiment.experiment.Experiment(model: torch.nn.modules.module.Module, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[OrderedDict[str, Callback], List[Callback]] = None, logdir: str = None, stage: str = 'train', criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, num_epochs: int = 1, valid_loader: str = 'valid', main_metric: str = 'loss', minimize_metric: bool = True, verbose: bool = False, check_time: bool = False, check_run: bool = False, overfit: bool = False, stage_kwargs: Dict = None, checkpoint_data: Dict = None, distributed_params: Dict = None, initial_seed: int = 42)[source]

Bases: catalyst.core.experiment.IExperiment

A simple single-stage experiment that you can use to declare an experiment in code.

__init__(model: torch.nn.modules.module.Module, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[OrderedDict[str, Callback], List[Callback]] = None, logdir: str = None, stage: str = 'train', criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, num_epochs: int = 1, valid_loader: str = 'valid', main_metric: str = 'loss', minimize_metric: bool = True, verbose: bool = False, check_time: bool = False, check_run: bool = False, overfit: bool = False, stage_kwargs: Dict = None, checkpoint_data: Dict = None, distributed_params: Dict = None, initial_seed: int = 42)[source]
Parameters
  • model (Model) – model

  • datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset objects for training, validation, or inference; used to create the loaders automatically. This is the preferred way for a distributed training setup.

  • loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference

  • callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks

  • logdir (str) – path to output directory

  • stage (str) – current stage

  • criterion (Criterion) – criterion function

  • optimizer (Optimizer) – optimizer

  • scheduler (Scheduler) – scheduler

  • num_epochs (int) – number of experiment’s epochs

  • valid_loader (str) – loader name used to calculate the metrics and save the checkpoints. For example, you can pass train and then the metrics will be taken from train loader.

  • main_metric (str) – the key to the name of the metric by which the checkpoints will be selected.

  • minimize_metric (bool) – flag to indicate whether the main_metric should be minimized.

  • verbose (bool) – if True, it displays the status of the training to the console.

  • check_time (bool) – if True, computes the execution time of training process and displays it to the console.

  • check_run (bool) – if True, we run only 3 batches per loader and 3 epochs per stage to check pipeline correctness

  • overfit (bool) – if True, takes only one batch per loader for model overfitting; for advanced usage, see BatchOverfitCallback

  • stage_kwargs (dict) – additional stage params

  • checkpoint_data (dict) – additional data to save in checkpoint, for example: class_names, date_of_training, etc

  • distributed_params (dict) – dictionary with the parameters for distributed and FP16 method

  • initial_seed (int) – experiment’s initial seed value
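
As a rough illustration of how valid_loader, main_metric, and minimize_metric interact, the sketch below picks the best epoch from recorded metrics. The function name select_best_epoch and the nested-dict history layout are assumptions made for this example, not Catalyst internals.

```python
from typing import Dict, List

# Hypothetical layout: one dict per epoch, keyed by loader name, then metric name.
EpochMetrics = Dict[str, Dict[str, float]]


def select_best_epoch(
    history: List[EpochMetrics],
    valid_loader: str = "valid",
    main_metric: str = "loss",
    minimize_metric: bool = True,
) -> int:
    """Return the index of the epoch whose main metric is best."""
    values = [epoch[valid_loader][main_metric] for epoch in history]
    pick = min if minimize_metric else max
    return values.index(pick(values))


history = [
    {"train": {"loss": 0.9}, "valid": {"loss": 0.8, "accuracy01": 0.60}},
    {"train": {"loss": 0.5}, "valid": {"loss": 0.6, "accuracy01": 0.75}},
    {"train": {"loss": 0.2}, "valid": {"loss": 0.7, "accuracy01": 0.72}},
]

best_by_valid_loss = select_best_epoch(history)
# Passing valid_loader="train" takes the metric from the train loader instead.
best_by_train_loss = select_best_epoch(history, valid_loader="train")
# A metric that should grow needs minimize_metric=False.
best_by_accuracy = select_best_epoch(
    history, main_metric="accuracy01", minimize_metric=False
)
```

The metric name accuracy01 is only a placeholder key for this example.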

property distributed_params

Dict with the parameters for distributed and FP16 method.

get_callbacks(stage: str) → OrderedDict[str, Callback][source]

Returns the callbacks for a given stage.

get_criterion(stage: str) → torch.nn.modules.module.Module[source]

Returns the criterion for a given stage.

get_loaders(stage: str, epoch: int = None) → OrderedDict[str, DataLoader][source]

Returns the loaders for a given stage.

get_model(stage: str) → torch.nn.modules.module.Module[source]

Returns the model for a given stage.

get_optimizer(stage: str, model: torch.nn.modules.module.Module) → torch.optim.optimizer.Optimizer[source]

Returns the optimizer for a given stage.

get_scheduler(stage: str, optimizer=None) → torch.optim.lr_scheduler._LRScheduler[source]

Returns the scheduler for a given stage.

get_stage_params(stage: str) → Mapping[str, Any][source]

Returns the state parameters for a given stage.

property hparams

Returns hyperparameters

property initial_seed

Experiment’s initial seed value.

property logdir

Path to the directory where the experiment logs are stored.

property stages

Experiment’s stage names (array with one value).

ConfigExperiment

class catalyst.dl.experiment.config.ConfigExperiment(config: Dict)[source]

Bases: catalyst.core.experiment.IExperiment

Experiment created from a configuration file.

STAGE_KEYWORDS = ['criterion_params', 'optimizer_params', 'scheduler_params', 'data_params', 'transform_params', 'stage_params', 'callbacks_params']
__init__(config: Dict)[source]
Parameters

config (dict) – dictionary with parameters

property distributed_params

Dict with the parameters for distributed and FP16 method.

get_callbacks(stage: str) → OrderedDict[Callback][source]

Returns the callbacks for a given stage.

get_criterion(stage: str) → torch.nn.modules.module.Module[source]

Returns the criterion for a given stage.

get_loaders(stage: str, epoch: int = None) → OrderedDict[str, DataLoader][source]

Returns the loaders for a given stage.

get_model(stage: str)[source]

Returns the model for a given stage.

get_optimizer(stage: str, model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]]) → Union[torch.optim.optimizer.Optimizer, Dict[str, torch.optim.optimizer.Optimizer]][source]

Returns the optimizer for a given stage.

Parameters
  • stage (str) – stage name

  • model (Union[Model, Dict[str, Model]]) – model or a dict of models

Returns

optimizer for selected stage

get_scheduler(stage: str, optimizer: torch.optim.optimizer.Optimizer) → torch.optim.lr_scheduler._LRScheduler[source]

Returns the scheduler for a given stage.

get_stage_params(stage: str) → Mapping[str, Any][source]

Returns the state parameters for a given stage.

get_transforms(stage: str = None, dataset: str = None) → Callable[source]

Returns transform for a given stage and dataset.

Parameters
  • stage (str) – stage name

  • dataset (str) – dataset name (e.g. “train”, “valid”), will be used only if the value of _key_value is True

Returns

transform function

Return type

Callable

property hparams

Returns hyperparameters

property initial_seed

Experiment’s initial seed value.

property logdir

Path to the directory where the experiment logs are stored.

property stages

Experiment’s stage names.

SupervisedExperiment

class catalyst.dl.experiment.supervised.SupervisedExperiment(model: torch.nn.modules.module.Module, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[OrderedDict[str, Callback], List[Callback]] = None, logdir: str = None, stage: str = 'train', criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, num_epochs: int = 1, valid_loader: str = 'valid', main_metric: str = 'loss', minimize_metric: bool = True, verbose: bool = False, check_time: bool = False, check_run: bool = False, overfit: bool = False, stage_kwargs: Dict = None, checkpoint_data: Dict = None, distributed_params: Dict = None, initial_seed: int = 42)[source]

Bases: catalyst.dl.experiment.experiment.Experiment

Supervised experiment.

The main difference from Experiment is that it adds several callbacks by default if you have not provided them.

The default callbacks are:
CriterionCallback:

measures loss with specified criterion.

OptimizerCallback:

abstraction over optimizer step.

SchedulerCallback:

calls lr_scheduler.step; added only if you provided a scheduler to your experiment.

CheckpointCallback:

saves model and optimizer state each epoch; a callback to save/restore your model, criterion, optimizer, and metrics.

ConsoleLogger:

standard Catalyst logger, translates runner.*_metrics to console and text file

TensorboardLogger:

will write runner.*_metrics to tensorboard

RaiseExceptionCallback:

will raise exception if needed

get_callbacks(stage: str) → OrderedDict[str, Callback][source]

Override of the BaseExperiment.get_callbacks method. Adds several default callbacks if they are missing.

Parameters

stage (str) – name of the stage. It should start with infer if you don’t need the default callbacks, as they are required only for training stages.

Returns

Ordered dictionary of callbacks for experiment

Return type

OrderedDict[str, Callback]
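
The behaviour described above (add a default callback only when one of that type is missing, and skip the defaults for stages whose name starts with infer) could be sketched roughly as follows. The classes here are empty stand-ins, and the function name add_default_callbacks and the dictionary keys such as "_criterion" are hypothetical; the real method may differ in detail.

```python
from collections import OrderedDict
from typing import Any, Dict


# Empty stand-ins for the real callback classes, for illustration only.
class CriterionCallback:
    pass


class OptimizerCallback:
    pass


def add_default_callbacks(
    callbacks: Dict[str, Any], stage: str, defaults: Dict[str, Any]
) -> Dict[str, Any]:
    """Add every default callback whose type is not already present."""
    result = OrderedDict(callbacks)
    if stage.startswith("infer"):
        return result  # inference stages keep only the user's callbacks
    present_types = {type(callback) for callback in result.values()}
    for name, callback in defaults.items():
        if type(callback) not in present_types:
            result[name] = callback
    return result


user_callbacks = OrderedDict(my_loss=CriterionCallback())
defaults = OrderedDict(
    _criterion=CriterionCallback(), _optimizer=OptimizerCallback()
)

train_callbacks = add_default_callbacks(user_callbacks, "train", defaults)
infer_callbacks = add_default_callbacks(user_callbacks, "infer_final", defaults)
```

Here the user already supplied a criterion callback, so only the optimizer callback is added for the train stage, and nothing is added for the inference stage.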

Runner

Runner

class catalyst.dl.runner.runner.Runner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, **kwargs)[source]

Bases: catalyst.core.runner.IStageBasedRunner

Deep learning runner for supervised, unsupervised, GAN, and other runs.

infer(*, model: torch.nn.modules.module.Module, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, verbose: bool = False, stage_kwargs: Dict = None, fp16: Union[Dict, bool] = None, check: bool = False, timeit: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]

Starts the inference stage of the model.

Parameters
  • model (Model) – model for inference

  • datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset objects for training, validation, or inference; used to create the loaders automatically. This is the preferred way for a distributed training setup.

  • loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference

  • callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks

  • logdir (str) – path to output directory

  • resume (str) – path to checkpoint to use for resume

  • verbose (bool) – if True, it displays the status of the training to the console.

  • stage_kwargs (dict) – additional stage params

  • fp16 (Union[Dict, bool]) – if not None, sets training to FP16; see https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the default params are {"opt_level": "O1"}.

  • check (bool) – if True, then only checks that pipeline is working (3 epochs only)

  • timeit (bool) – if True, computes the execution time of training process and displays it to the console.

  • initial_seed (int) – experiment’s initial seed value

  • state_kwargs (dict) – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Run model inference on specified data batch.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.

  • **kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]

Raises

NotImplementedError – if not implemented yet

predict_loader(*, loader: torch.utils.data.dataloader.DataLoader, model: torch.nn.modules.module.Module = None, resume: str = None, fp16: Union[Dict, bool] = None, initial_seed: int = 42) → Generator[source]

Runs model inference on PyTorch Dataloader and returns python generator with model predictions from runner.predict_batch. Cleans up the experiment info to avoid possible collisions. Sets is_train_loader and is_valid_loader to False while keeping is_infer_loader as True. Moves model to evaluation mode.

Parameters
  • loader (DataLoader) – loader to predict

  • model (Model) – model to use for prediction

  • resume (str) – path to checkpoint to resume

  • fp16 (Union[Dict, bool]) – fp16 usage flag

  • initial_seed (int) – seed to use before prediction

Yields

batches with model predictions
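
The generator contract of predict_loader can be pictured with a minimal sketch: one prediction dictionary is produced lazily per batch. The loader and the double_features function below are stand-ins invented for the example; checkpoint loading, FP16 setup, and evaluation mode are deliberately left out.

```python
from typing import Any, Callable, Iterable, Iterator, Mapping

Batch = Mapping[str, Any]


def predict_loader(
    loader: Iterable[Batch], predict_batch: Callable[[Batch], Batch]
) -> Iterator[Batch]:
    """Lazily yield one prediction dict per batch of the loader."""
    for batch in loader:
        yield predict_batch(batch)


def double_features(batch: Batch) -> Batch:
    """Stand-in for a model call: doubles every feature value."""
    return {"logits": [2 * x for x in batch["features"]]}


# A plain list of dicts plays the role of a DataLoader here.
loader = [{"features": [1, 2]}, {"features": [3]}]
predictions = list(predict_loader(loader, double_features))
```

Because the result is a generator, nothing is computed until the caller iterates over it.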

trace(*, model: torch.nn.modules.module.Module = None, batch: Any = None, logdir: str = None, loader: torch.utils.data.dataloader.DataLoader = None, method_name: str = 'forward', mode: str = 'eval', requires_grad: bool = False, fp16: Union[Dict, bool] = None, device: Union[str, torch.device] = 'cpu', predict_params: dict = None) → torch.jit.ScriptModule[source]

Traces model using Torch Jit.

Parameters
  • model (Model) – model to trace

  • batch – batch to forward through the model to trace

  • logdir (str, optional) – If specified, the result will be written to the directory

  • loader (DataLoader, optional) – if batch is not specified, the batch will be next(iter(loader))

  • method_name (str) – model’s method name that will be traced

  • mode (str) – train or eval

  • requires_grad (bool) – flag to trace with gradients

  • fp16 (Union[Dict, bool]) – If not None, then sets tracing params to FP16

  • device (Device) – Torch device or a string

  • predict_params (dict) – additional parameters for model forward

Returns

traced model

Return type

ScriptModule

Raises

ValueError – if both batch and loader are None

train(*, model: torch.nn.modules.module.Module, criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, datasets: OrderedDict[str, Union[Dataset, Dict, Any]] = None, loaders: OrderedDict[str, DataLoader] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, logdir: str = None, resume: str = None, num_epochs: int = 1, valid_loader: str = 'valid', main_metric: str = 'loss', minimize_metric: bool = True, verbose: bool = False, stage_kwargs: Dict = None, checkpoint_data: Dict = None, fp16: Union[Dict, bool] = None, distributed: bool = False, check: bool = False, overfit: bool = False, timeit: bool = False, load_best_on_end: bool = False, initial_seed: int = 42, state_kwargs: Dict = None) → None[source]

Starts the train stage of the model.

Parameters
  • model (Model) – model to train

  • criterion (Criterion) – criterion function for training

  • optimizer (Optimizer) – optimizer for training

  • scheduler (Scheduler) – scheduler for training

  • datasets (OrderedDict[str, Union[Dataset, Dict, Any]]) – dictionary with one or several torch.utils.data.Dataset objects for training, validation, or inference; used to create the loaders automatically. This is the preferred way for a distributed training setup.

  • loaders (OrderedDict[str, DataLoader]) – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference

  • callbacks (Union[List[Callback], OrderedDict[str, Callback]]) – list or dictionary with Catalyst callbacks

  • logdir (str) – path to output directory

  • resume (str) – path to checkpoint for model

  • num_epochs (int) – number of training epochs

  • valid_loader (str) – loader name used to calculate the metrics and save the checkpoints. For example, you can pass train and then the metrics will be taken from train loader.

  • main_metric (str) – the key to the name of the metric by which the checkpoints will be selected.

  • minimize_metric (bool) – flag to indicate whether the main_metric should be minimized.

  • verbose (bool) – if True, it displays the status of the training to the console.

  • stage_kwargs (dict) – additional params for stage

  • checkpoint_data (dict) – additional data to save in checkpoint, for example: class_names, date_of_training, etc

  • fp16 (Union[Dict, bool]) – if not None, sets training to FP16; see https://nvidia.github.io/apex/amp.html#properties. If fp16=True, the default params are {"opt_level": "O1"}.

  • distributed (bool) – if True will start training in distributed mode. Note: Works only with python scripts. No jupyter support.

  • check (bool) – if True, then only checks that pipeline is working (3 epochs only with 3 batches per loader)

  • overfit (bool) – if True, takes only one batch per loader for model overfitting; for advanced usage, see BatchOverfitCallback

  • timeit (bool) – if True, computes the execution time of training process and displays it to the console.

  • load_best_on_end (bool) – if True, Runner will load best checkpoint state (model, optimizer, etc) according to validation metrics. Requires specified logdir.

  • initial_seed (int) – experiment’s initial seed value

  • state_kwargs (dict) – deprecated, use stage_kwargs instead

Raises

NotImplementedError – if both resume and CheckpointCallback already exist
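
The fp16 argument accepts either a bool or a dict. One plausible way to normalise it, following the parameter description above, is sketched below. The helper name resolve_fp16_params is hypothetical, and treating False the same as None is an assumption of this sketch.

```python
from typing import Dict, Optional, Union


def resolve_fp16_params(fp16: Union[Dict, bool, None]) -> Optional[Dict]:
    """Turn the fp16 argument into a params dict, or None if FP16 is off."""
    if isinstance(fp16, dict):
        return dict(fp16)  # user-provided params are passed through
    if fp16:
        return {"opt_level": "O1"}  # documented default for fp16=True
    return None  # assumption: False and None both disable FP16


default_params = resolve_fp16_params(True)
custom_params = resolve_fp16_params({"opt_level": "O2"})
disabled = resolve_fp16_params(None)
```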

SupervisedRunner

class catalyst.dl.runner.supervised.SupervisedRunner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets')[source]

Bases: catalyst.dl.runner.runner.Runner

Runner for experiments with supervised model.

__init__(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, input_key: Any = 'features', output_key: Any = 'logits', input_target_key: str = 'targets')[source]
Parameters
  • model (RunnerModel) – Torch model object

  • device (Device) – Torch device

  • input_key (Any) – Key in batch dict mapping for model input

  • output_key (Any) – Key in output dict model output will be stored under

  • input_target_key (str) – Key in batch dict mapping for target

forward(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Forward method for your Runner. Should not be called directly outside of the runner. If your model has a specific interface, override this method to use it.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoaders.

  • **kwargs – additional parameters to pass to the model

Returns

dict with model output batch
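
To make the roles of input_key and output_key concrete, here is a minimal sketch of the mapping they describe: the model input is read from the batch under input_key, and the model output is stored under output_key. It covers only the case where both keys are plain strings, and the scaling model is a stand-in invented for the example.

```python
from typing import Any, Callable, Dict, Mapping


def forward(
    model: Callable[..., Any],
    batch: Mapping[str, Any],
    input_key: str = "features",
    output_key: str = "logits",
    **kwargs: Any,
) -> Dict[str, Any]:
    """Feed batch[input_key] to the model and wrap the result in a dict."""
    output = model(batch[input_key], **kwargs)
    return {output_key: output}


def scale_model(features, scale=1):
    """Stand-in model: multiplies every feature by a constant."""
    return [scale * x for x in features]


batch = {"features": [1, 2], "targets": [0, 1]}
output = forward(scale_model, batch, scale=3)
renamed = forward(scale_model, batch, output_key="embeddings")
```

Extra keyword arguments are forwarded to the model call, which mirrors the **kwargs parameter documented above.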

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Run model inference on specified data batch.

Warning

You should not override this method. If you need a specific model call, override the forward() method.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoader.

  • **kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]

Callbacks

ConfusionMatrixCallback

class catalyst.dl.callbacks.confusion_matrix.ConfusionMatrixCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'confusion_matrix', version: str = 'tnt', class_names: List[str] = None, num_classes: int = None, plot_params: Dict = None, tensorboard_callback_name: str = '_tensorboard')[source]

Bases: catalyst.core.callback.Callback

@TODO: Docs. Contribution is welcome.

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'confusion_matrix', version: str = 'tnt', class_names: List[str] = None, num_classes: int = None, plot_params: Dict = None, tensorboard_callback_name: str = '_tensorboard')[source]
Parameters

@TODO – Docs. Contribution is welcome

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

Batch end hook.

Parameters

runner (IRunner) – current runner

on_loader_end(runner: catalyst.core.runner.IRunner)[source]

Loader end hook.

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

Loader start hook.

Parameters

runner (IRunner) – current runner

InferCallback

class catalyst.dl.callbacks.inference.InferCallback(out_dir=None, out_prefix=None)[source]

Bases: catalyst.core.callback.Callback

@TODO: Docs. Contribution is welcome.

__init__(out_dir=None, out_prefix=None)[source]
Parameters

@TODO – Docs. Contribution is welcome

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

Batch end hook.

Parameters

runner (IRunner) – current runner

on_loader_end(runner: catalyst.core.runner.IRunner)[source]

Loader end hook.

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

Loader start hook.

Parameters

runner (IRunner) – current runner

on_stage_start(runner: catalyst.core.runner.IRunner)[source]

Stage start hook.

Parameters

runner (IRunner) – current runner

MeterMetricsCallback

class catalyst.dl.callbacks.meter.MeterMetricsCallback(metric_names: List[str], meter_list: List, input_key: str = 'targets', output_key: str = 'logits', class_names: List[str] = None, num_classes: int = 2, activation: str = 'Sigmoid')[source]

Bases: catalyst.core.callback.Callback

A callback that tracks metrics through meters and prints metrics for each class on runner.on_loader_end.

Note

This callback works for both single metric and multi-metric meters.

__init__(metric_names: List[str], meter_list: List, input_key: str = 'targets', output_key: str = 'logits', class_names: List[str] = None, num_classes: int = 2, activation: str = 'Sigmoid')[source]
Parameters
  • metric_names (List[str]) – names of the metrics to print. Make sure that they are in the same order as the metrics output by the meters in meter_list

  • meter_list (list-like) – list of meters.meter.Meter instances; len(meter_list) == num_classes

  • input_key (str) – input key to use for metric calculation; specifies our y_true

  • output_key (str) – output key to use for metric calculation; specifies our y_pred

  • class_names (List[str]) – class names to display in the logs. If None, defaults to indices for each class, starting from 0.

  • num_classes (int) – Number of classes; must be > 1

  • activation (str) – A torch.nn activation applied to the logits. Must be one of [‘none’, ‘Sigmoid’, ‘Softmax2d’]

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

Batch end hook. Computes batch metrics.

Parameters

runner (IRunner) – current runner

on_loader_end(runner: catalyst.core.runner.IRunner)[source]

Loader end hook. Computes loader metrics.

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

Loader start hook.

Parameters

runner (IRunner) – current runner

MixupCallback

class catalyst.dl.callbacks.mixup.MixupCallback(input_key: str = 'targets', output_key: str = 'logits', fields: List[str] = ('features', ), alpha=1.0, on_train_only=True, **kwargs)[source]

Bases: catalyst.core.callbacks.criterion.CriterionCallback

Callback to do mixup augmentation.

More details about mixup can be found in the paper mixup: Beyond Empirical Risk Minimization.

Warning

catalyst.dl.callbacks.MixupCallback inherits from catalyst.dl.CriterionCallback and already does its work, so do not use the two together.

__init__(input_key: str = 'targets', output_key: str = 'logits', fields: List[str] = ('features', ), alpha=1.0, on_train_only=True, **kwargs)[source]
Parameters
  • fields (List[str]) – list of features which must be affected.

  • alpha (float) – parameter of the beta distribution (a = b = alpha). Must be >= 0. The closer alpha is to zero, the weaker the mixup effect.

  • on_train_only (bool) – Apply to train only. Because mixup uses proxy inputs, the targets are also proxies, and metrics computed on them are not of interest. So, if on_train_only is True, a standard output/metric is used for validation.
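
The mixup idea behind these parameters can be sketched in plain Python: a coefficient lam is drawn from Beta(alpha, alpha), each sample is blended with a randomly chosen partner from the same batch, and the loss is blended with the same coefficient. Scalars stand in for feature tensors, the function names are invented for this sketch, and treating alpha == 0 as "no mixing" is an assumption.

```python
import random
from typing import List, Tuple


def mixup_batch(
    features: List[float], alpha: float, rng: random.Random
) -> Tuple[List[float], List[int], float]:
    """Blend each sample with a randomly chosen partner from the batch."""
    # lam ~ Beta(alpha, alpha); alpha == 0 is treated here as "no mixing".
    lam = rng.betavariate(alpha, alpha) if alpha > 0 else 1.0
    perm = list(range(len(features)))
    rng.shuffle(perm)
    mixed = [
        lam * features[i] + (1 - lam) * features[j]
        for i, j in enumerate(perm)
    ]
    return mixed, perm, lam


def mixup_loss(loss_original: float, loss_permuted: float, lam: float) -> float:
    """Blend the losses against the original and the permuted targets."""
    return lam * loss_original + (1 - lam) * loss_permuted


features = [0.0, 1.0, 2.0, 3.0]
rng = random.Random(0)
mixed, perm, lam = mixup_batch(features, alpha=1.0, rng=rng)
unmixed, _, lam_zero = mixup_batch(features, alpha=0.0, rng=rng)
```

The permutation is returned so that the targets can be paired with the same partners when the loss is computed.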

on_batch_start(runner: catalyst.core.runner.IRunner)[source]

Batch start hook.

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

Loader start hook.

Parameters

runner (IRunner) – current runner

LRFinder

class catalyst.dl.callbacks.scheduler.LRFinder(final_lr, scale: str = 'log', num_steps: Optional[int] = None, optimizer_key: str = None)[source]

Bases: catalyst.core.callbacks.scheduler.LRUpdater

Helps you find an optimal learning rate for a model, as per suggestion of Cyclical Learning Rates for Training Neural Networks paper. Learning rate is increased in linear or log scale, depending on user input.

See How Do You Find A Good Learning Rate article for details.

__init__(final_lr, scale: str = 'log', num_steps: Optional[int] = None, optimizer_key: str = None)[source]
Parameters
  • final_lr – final learning rate to try with

  • scale (str) – learning rate increasing scale (“log” or “linear”)

  • num_steps (Optional[int]) – number of batches to try; if None - whole loader would be used.

  • optimizer_key (str) – which optimizer key to use for learning rate scheduling
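
A learning rate sweep of this kind can be sketched as follows: the rate grows from a starting value to final_lr over num_steps batches, either geometrically ("log") or arithmetically ("linear"). The name lr_at_step and the explicit init_lr argument are assumptions of this sketch; in practice the starting rate would come from the optimizer.

```python
def lr_at_step(
    init_lr: float,
    final_lr: float,
    step: int,
    num_steps: int,
    scale: str = "log",
) -> float:
    """Learning rate after `step` of `num_steps` batches of the sweep."""
    fraction = step / num_steps
    if scale == "log":
        # Geometric interpolation: equal ratios between consecutive steps.
        return init_lr * (final_lr / init_lr) ** fraction
    if scale == "linear":
        # Arithmetic interpolation: equal differences between steps.
        return init_lr + (final_lr - init_lr) * fraction
    raise ValueError(f"unknown scale: {scale!r}")


# Five points of a log-scale sweep from 1e-5 to 1.0.
schedule = [lr_at_step(1e-5, 1.0, step, 4) for step in range(5)]
```

Plotting the loss against such a schedule and picking a rate shortly before the loss starts to diverge is the usual way the result is read.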

calc_lr()[source]

Calculates learning rate.

Returns

learning rate.

calc_momentum()[source]

@TODO: Docs. Contribution is welcome.

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

@TODO: Docs. Contribution is welcome.

Parameters

runner (IRunner) – current runner

Raises

NotImplementedError – at the end of LRFinder

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

@TODO: Docs. Contribution is welcome.

Parameters

runner (IRunner) – current runner

TracerCallback

class catalyst.dl.callbacks.tracing.TracerCallback(metric: str = 'loss', minimize: bool = True, min_delta: float = 1e-06, mode: str = 'best', do_once: bool = True, method_name: str = 'forward', requires_grad: bool = False, opt_level: str = None, trace_mode: str = 'eval', out_dir: Union[str, pathlib.Path] = None, out_model: Union[str, pathlib.Path] = None)[source]

Bases: catalyst.core.callback.Callback

Traces model during training if metric provided is improved.

__init__(metric: str = 'loss', minimize: bool = True, min_delta: float = 1e-06, mode: str = 'best', do_once: bool = True, method_name: str = 'forward', requires_grad: bool = False, opt_level: str = None, trace_mode: str = 'eval', out_dir: Union[str, pathlib.Path] = None, out_model: Union[str, pathlib.Path] = None)[source]
Parameters
  • metric (str) – Metric key we should trace model based on

  • minimize (bool) – Whether to minimize the metric

  • min_delta (float) – Minimum value of change for metric to be considered as improved

  • mode (str) – One of best or last

  • do_once (bool) – Whether to trace once per stage or every epoch

  • method_name (str) – Model’s method name that will be used as entrypoint during tracing

  • requires_grad (bool) – Flag to use grads

  • opt_level (str) – AMP FP16 init level

  • trace_mode (str) – Mode for model to trace (train or eval)

  • out_dir (Union[str, Path]) – Directory to save model to

  • out_model (Union[str, Path]) – Path to save model to (overrides out_dir argument)

on_epoch_end(runner: catalyst.core.runner.IRunner)[source]

Performs model tracing on epoch end if the condition metric has improved.

Parameters

runner (IRunner) – Current runner

on_stage_end(runner: catalyst.core.runner.IRunner)[source]

Performing model tracing on stage end if do_once is True.

Parameters

runner (IRunner) – Current runner

Metrics

Accuracy

class catalyst.dl.callbacks.metrics.accuracy.AccuracyCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'accuracy', multiplier: float = 1.0, topk_args: List[int] = None, num_classes: int = None, accuracy_args: List[int] = None, **kwargs)[source]

Bases: catalyst.core.callbacks.metrics.BatchMetricCallback

Accuracy metric callback. Computes multi-class accuracy@topk for the specified values of topk.

Note

For multi-label accuracy please use catalyst.dl.callbacks.metrics.MultiLabelAccuracyCallback

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'accuracy', multiplier: float = 1.0, topk_args: List[int] = None, num_classes: int = None, accuracy_args: List[int] = None, **kwargs)[source]
Parameters
  • input_key (str) – input key to use for accuracy calculation; specifies our y_true

  • output_key (str) – output key to use for accuracy calculation; specifies our y_pred

  • prefix (str) – key for the metric’s name

  • topk_args (List[int]) – specifies which accuracy@K to log: [1] - accuracy [1, 3] - accuracy at 1 and 3 [1, 3, 5] - accuracy at 1, 3 and 5

  • num_classes (int) – number of classes to calculate topk_args if accuracy_args is None

  • activation (str) – A torch.nn activation applied to the outputs. Must be one of "none", "Sigmoid", or "Softmax"
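
For readers unfamiliar with accuracy@K, the following plain-Python sketch shows what the metric measures: a sample counts as correct at K if its target class is among the K highest-scoring classes. The function topk_accuracy is written for this example and is not the callback's implementation.

```python
from typing import Dict, Sequence


def topk_accuracy(
    logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    topk: Sequence[int] = (1,),
) -> Dict[int, float]:
    """Fraction of samples whose target is among the k best-scored classes."""
    results = {}
    for k in topk:
        hits = 0
        for scores, target in zip(logits, targets):
            # Class indices sorted from highest to lowest score.
            ranked = sorted(
                range(len(scores)), key=lambda c: scores[c], reverse=True
            )
            hits += target in ranked[:k]
        results[k] = hits / len(targets)
    return results


logits = [
    [0.1, 0.7, 0.2],    # best class: 1
    [0.5, 0.3, 0.2],    # best class: 0
    [0.2, 0.3, 0.5],    # best class: 2
    [0.4, 0.35, 0.25],  # best class: 0
]
targets = [1, 1, 2, 2]
accuracy = topk_accuracy(logits, targets, topk=(1, 3))
```

With three classes, accuracy at 3 is trivially perfect, which is why K is normally chosen well below the number of classes.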

class catalyst.dl.callbacks.metrics.accuracy.MultiLabelAccuracyCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'multi_label_accuracy', threshold: float = None, activation: str = 'Sigmoid')[source]

Bases: catalyst.core.callbacks.metrics.BatchMetricCallback

Multi-label accuracy metric callback.

Note

For multi-class accuracy@topk please use catalyst.dl.callbacks.metrics.accuracy.AccuracyCallback

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'multi_label_accuracy', threshold: float = None, activation: str = 'Sigmoid')[source]
Parameters
  • input_key (str) – input key to use for accuracy calculation; specifies our y_true

  • output_key (str) – output key to use for accuracy calculation; specifies our y_pred

  • prefix (str) – key for the metric’s name

  • threshold (float) – threshold for model output

  • activation (str) – A torch.nn activation applied to the outputs. Must be one of "none", "Sigmoid", or "Softmax"
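
One common way to compute multi-label accuracy from these ingredients is sketched below: apply a sigmoid, binarise with the threshold, and count the fraction of label slots that match. This is an illustrative assumption about the metric, and the threshold of 0.5 is chosen for the example only (the documented default is None).

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def multi_label_accuracy(
    logits: Sequence[Sequence[float]],
    targets: Sequence[Sequence[int]],
    threshold: float = 0.5,
) -> float:
    """Fraction of (sample, label) slots predicted correctly."""
    correct = 0
    total = 0
    for row_logits, row_targets in zip(logits, targets):
        for logit, target in zip(row_logits, row_targets):
            predicted = int(sigmoid(logit) > threshold)
            correct += predicted == target
            total += 1
    return correct / total


logits = [[2.0, -1.0, 0.5], [-3.0, 1.5, -0.2]]
targets = [[1, 0, 0], [0, 1, 0]]
accuracy = multi_label_accuracy(logits, targets)
```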

AUC

class catalyst.dl.callbacks.metrics.auc.AUCCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'auc', multiplier: float = 1.0, class_args: List[str] = None, **kwargs)[source]

Bases: catalyst.core.callbacks.metrics.LoaderMetricCallback

Calculates the AUC per class for each loader.

Note

Currently supports binary and multi-label cases.

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'auc', multiplier: float = 1.0, class_args: List[str] = None, **kwargs)[source]
Parameters
  • input_key (str) – input key to use for auc calculation; specifies our y_true.

  • output_key (str) – output key to use for auc calculation; specifies our y_pred.

  • prefix (str) – metric’s name.

  • multiplier (float) – scale factor for the metric.

  • class_args (List[str]) – class names to display in the logs. If None, defaults to indices for each class, starting from 0
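
As a reminder of what a per-class AUC means, the sketch below computes it by pairwise comparison: the share of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. This quadratic-time version is for illustration only and is not the callback's implementation.

```python
from typing import List, Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        (p > n) + 0.5 * (p == n) for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def per_class_auc(
    scores: Sequence[Sequence[float]], labels: Sequence[Sequence[int]]
) -> List[float]:
    """Apply binary_auc independently to every class column."""
    num_classes = len(scores[0])
    return [
        binary_auc([row[c] for row in scores], [row[c] for row in labels])
        for c in range(num_classes)
    ]


scores = [[0.9, 0.2], [0.8, 0.6], [0.3, 0.7], [0.1, 0.4]]
labels = [[1, 1], [1, 0], [0, 1], [0, 0]]
auc = per_class_auc(scores, labels)
```

Class 0 is perfectly separated in this toy data, while class 1 is ranked no better than chance.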

CMC score

class catalyst.dl.callbacks.metrics.cmc_score.CMCScoreCallback(embeddings_key: str = 'logits', labels_key: str = 'targets', is_query_key: str = 'is_query', prefix: str = 'cmc', topk_args: List[int] = None, num_classes: int = None)[source]

Bases: catalyst.core.callback.Callback

Cumulative Matching Characteristics callback.

Note

You should use it with ControlFlowCallback and add all query/gallery sets to loaders. Loaders should contain “is_query” and “label” keys.

A usage example can be found in Readme.md under “CV - MNIST with Metric Learning”.

__init__(embeddings_key: str = 'logits', labels_key: str = 'targets', is_query_key: str = 'is_query', prefix: str = 'cmc', topk_args: List[int] = None, num_classes: int = None)[source]

This callback was designed to compute cumulative matching characteristics. Your dataset should output True under is_query_key if the current object is from the query set and False if it is from the gallery. See QueryGalleryDataset in catalyst.contrib.datasets.metric_learning for more information. On batch end, the callback accumulates all embeddings.

Parameters
  • embeddings_key (str) – embeddings key in output dict

  • labels_key (str) – labels key in output dict

  • is_query_key (str) – bool key True if current object is from query

  • prefix (str) – key for the metric’s name

  • topk_args (List[int]) – specifies which cmc@K to log. [1] - cmc@1 [1, 3] - cmc@1 and cmc@3 [1, 3, 5] - cmc@1, cmc@3 and cmc@5

  • num_classes (int) – number of classes to calculate accuracy_args if topk_args is None
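
The CMC@K metric can be illustrated with a small plain-Python sketch: samples are split into query and gallery sets by the is_query flag, each query ranks the gallery by distance, and the query counts as a hit if any of its K nearest gallery items shares its label. Squared Euclidean distance and the function name cmc_score are assumptions of this example.

```python
from typing import List, Sequence


def cmc_score(
    embeddings: Sequence[Sequence[float]],
    labels: Sequence[int],
    is_query: Sequence[bool],
    topk: int = 1,
) -> float:
    """Share of queries with a same-label item among the topk nearest."""
    queries = [i for i, flag in enumerate(is_query) if flag]
    gallery = [i for i, flag in enumerate(is_query) if not flag]
    hits = 0
    for q in queries:
        # Squared Euclidean distance from the query to each gallery item.
        distance = {
            g: sum((a - b) ** 2 for a, b in zip(embeddings[q], embeddings[g]))
            for g in gallery
        }
        nearest: List[int] = sorted(gallery, key=distance.get)[:topk]
        hits += any(labels[g] == labels[q] for g in nearest)
    return hits / len(queries)


embeddings = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.2, 0.1], [0.9, 1.0]]
labels = [0, 1, 1, 0, 1]
is_query = [True, True, False, False, False]

cmc_at_1 = cmc_score(embeddings, labels, is_query, topk=1)
cmc_at_2 = cmc_score(embeddings, labels, is_query, topk=2)
```

The first query's nearest gallery item has the wrong label, so it only becomes a hit once K is raised to 2.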

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

On batch end action

on_loader_end(runner: catalyst.core.runner.IRunner)[source]

On loader end action

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

On loader start action
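
The query/gallery split and the meaning of cmc@K can be sketched in plain Python as follows. This assumes Euclidean distance between embeddings and counts a query as a hit when any of its K nearest gallery items shares its label; the sample layout and the helper names (squared_distance, cmc_at_k) are hypothetical and do not reproduce the callback's internals.

```python
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance; ranks items the same way as the distance itself."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cmc_at_k(samples: List[Dict], k: int) -> float:
    """Fraction of queries with a same-label item among the k nearest gallery items."""
    queries = [s for s in samples if s["is_query"]]
    gallery = [s for s in samples if not s["is_query"]]
    hits = 0
    for query in queries:
        ranked = sorted(
            gallery,
            key=lambda g: squared_distance(query["embedding"], g["embedding"]),
        )
        if any(g["label"] == query["label"] for g in ranked[:k]):
            hits += 1
    return hits / len(queries)


samples = [
    {"embedding": (0.0, 0.0), "label": 0, "is_query": True},
    {"embedding": (1.0, 1.0), "label": 1, "is_query": True},
    {"embedding": (0.1, 0.0), "label": 1, "is_query": False},
    {"embedding": (0.3, 0.0), "label": 0, "is_query": False},
    {"embedding": (1.0, 0.9), "label": 1, "is_query": False},
]

cmc_1 = cmc_at_k(samples, k=1)
cmc_2 = cmc_at_k(samples, k=2)
```

In this toy data the first query's nearest gallery item has the wrong label, so it only becomes a hit at K=2.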

Dice

class catalyst.dl.callbacks.metrics.dice.DiceCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'dice', eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

Bases: catalyst.core.callbacks.metrics.BatchMetricCallback

Dice metric callback.

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'dice', eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]
Parameters
  • input_key (str) – input key to use for dice calculation; specifies our y_true

  • output_key (str) – output key to use for dice calculation; specifies our y_pred
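
A minimal sketch of how eps, threshold and a sigmoid activation can enter a Dice score, written in plain Python. The helper dice_score is hypothetical, and the exact formula and the strict ">" comparison against the threshold are assumptions rather than the library's confirmed behaviour.

```python
import math
from typing import Optional, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dice_score(
    targets: Sequence[float],
    logits: Sequence[float],
    eps: float = 1e-7,
    threshold: Optional[float] = None,
) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B| + eps) on sigmoid-activated outputs."""
    outputs = [sigmoid(x) for x in logits]
    if threshold is not None:
        # Binarize the activated outputs (assumed strict comparison).
        outputs = [1.0 if o > threshold else 0.0 for o in outputs]
    intersection = sum(t * o for t, o in zip(targets, outputs))
    total = sum(targets) + sum(outputs)
    return 2.0 * intersection / (total + eps)


targets = [1.0, 0.0, 1.0, 0.0]
logits = [2.0, -1.0, 0.5, -3.0]

hard_dice = dice_score(targets, logits, threshold=0.5)  # binarized outputs
soft_dice = dice_score(targets, logits)                 # raw probabilities
```

With threshold=None the score is computed on probabilities, so it is lower than the binarized score even though every thresholded prediction is correct.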

class catalyst.dl.callbacks.metrics.dice.MultiClassDiceMetricCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'dice', class_names=None)[source]

Bases: catalyst.core.callback.Callback

Global Multi-Class Dice Metric Callback: calculates the exact dice score across multiple batches. This callback is good for getting the dice score with small batch sizes where the batchwise dice is noisier.

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'dice', class_names=None)[source]
Parameters
  • input_key (str) – input key to use for dice calculation; specifies our y_true

  • output_key (str) – output key to use for dice calculation; specifies our y_pred

  • prefix (str) – prefix for printing the metric

  • class_names (dict/List) – if a dictionary, it should be {class_id: class_name, …}, where class_id is an integer; this allows you to ignore class indices. If a list, make sure its length corresponds to the number of classes

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

Records the confusion matrix at the end of each batch.

Parameters

runner (IRunner) – current runner

on_loader_end(runner: catalyst.core.runner.IRunner)[source]

@TODO: Docs. Contribution is welcome.

Parameters

runner (IRunner) – current runner

catalyst.dl.callbacks.metrics.dice.MulticlassDiceMetricCallback

alias of catalyst.dl.callbacks.metrics.dice.MultiClassDiceMetricCallback
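
The idea of accumulating a confusion matrix per batch and computing an exact Dice score once per loader can be sketched as below. The helper names (update_confusion, dice_per_class) and the example class names are hypothetical, and the per-class formula 2·TP / (2·TP + FP + FN) is the standard definition, assumed here rather than taken from the library.

```python
from typing import Dict, List, Sequence


def update_confusion(
    matrix: List[List[int]], y_true: Sequence[int], y_pred: Sequence[int]
) -> None:
    """Add one batch of (true, predicted) class ids; rows are true classes."""
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1


def dice_per_class(matrix: List[List[int]]) -> List[float]:
    """Per-class Dice from the accumulated confusion matrix."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


num_classes = 3
confusion = [[0] * num_classes for _ in range(num_classes)]

# Two small batches, as they would arrive at the end of each batch.
update_confusion(confusion, y_true=[0, 1, 2], y_pred=[0, 1, 1])
update_confusion(confusion, y_true=[2, 2, 0], y_pred=[2, 2, 1])

scores = dice_per_class(confusion)

# A dict form of class_names selects which class ids get reported.
class_names: Dict[int, str] = {0: "background", 2: "lesion"}
named_scores = {name: scores[class_id] for class_id, name in class_names.items()}
```

Because counts are summed before dividing, the result does not depend on how the samples were split into batches.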

F1 score

class catalyst.dl.callbacks.metrics.f1_score.F1ScoreCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'f1_score', beta: float = 1.0, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

Bases: catalyst.core.callbacks.metrics.BatchMetricCallback

F1 score metric callback.

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'f1_score', beta: float = 1.0, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]
Parameters
  • input_key (str) – input key to use for F1 score calculation; specifies our y_true

  • output_key (str) – output key to use for F1 score calculation; specifies our y_pred

  • prefix (str) – key to store in logs

  • beta (float) – beta param for f_score

  • eps (float) – epsilon to avoid zero division

  • threshold (float) – threshold for outputs binarization

  • activation (str) – A torch.nn activation applied to the outputs. Must be one of 'none', 'Sigmoid', or 'Softmax2d'
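
The role of beta and eps can be seen in a plain-Python sketch of the F-beta score on already binarized predictions. The helper fbeta_score is hypothetical and uses the standard count-based formula, which is assumed to match the callback only in spirit.

```python
from typing import Sequence


def fbeta_score(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    beta: float = 1.0,
    eps: float = 1e-7,
) -> float:
    """F-beta = (1 + b^2) * TP / ((1 + b^2) * TP + b^2 * FN + FP + eps)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp + eps)


y_true = [1, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0]  # TP = 1, FN = 2, FP = 1

f1 = fbeta_score(y_true, y_pred, beta=1.0)
f2 = fbeta_score(y_true, y_pred, beta=2.0)  # beta > 1 weights recall more heavily
```

Recall is lower than precision in this example, so the recall-weighted F2 comes out below F1.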

IOU

class catalyst.dl.callbacks.metrics.iou.IouCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'iou', eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

Bases: catalyst.core.callbacks.metrics.BatchMetricCallback

IoU (Jaccard) metric callback.

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'iou', eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]
Parameters
  • input_key (str) – input key to use for iou calculation; specifies our y_true

  • output_key (str) – output key to use for iou calculation; specifies our y_pred

  • prefix (str) – key to store in logs

  • eps (float) – epsilon to avoid zero division

  • threshold (float) – threshold for outputs binarization

  • activation (str) – A torch.nn activation applied to the outputs. Must be one of 'none', 'Sigmoid', or 'Softmax2d'

catalyst.dl.callbacks.metrics.iou.JaccardCallback

alias of catalyst.dl.callbacks.metrics.iou.IouCallback
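
For reference, the IoU (Jaccard) score on binarized masks can be sketched in plain Python as intersection over union, with eps guarding against division by zero. The helper iou_score is hypothetical and is not the library's implementation.

```python
from typing import Sequence


def iou_score(
    y_true: Sequence[int], y_pred: Sequence[int], eps: float = 1e-7
) -> float:
    """IoU = |A ∩ B| / (|A ∪ B| + eps) for flat binary masks."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    union = sum(y_true) + sum(y_pred) - intersection
    return intersection / (union + eps)


y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]

iou = iou_score(y_true, y_pred)       # 1 shared pixel out of 3 in the union
empty = iou_score([0, 0], [0, 0])     # eps keeps the empty case finite
```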

class catalyst.dl.callbacks.metrics.iou.ClasswiseIouCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'iou', classes: List[str] = None, num_classes: int = None, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

Bases: catalyst.core.callbacks.metrics.BatchMetricCallback

Classwise IoU (Jaccard) metric callback.

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'iou', classes: List[str] = None, num_classes: int = None, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]
Parameters
  • input_key (str) – input key to use for iou calculation; specifies our y_true

  • output_key (str) – output key to use for iou calculation; specifies our y_pred

  • prefix (str) – key to store in logs (will be prefix_class_name)

  • classes (List[str]) – list of class names. You should specify either ‘classes’ or ‘num_classes’

  • num_classes (int) – number of classes. You should specify either ‘classes’ or ‘num_classes’

  • eps (float) – epsilon to avoid zero division

  • threshold (float) – threshold for outputs binarization

  • activation (str) – A torch.nn activation applied to the outputs. Must be one of 'none', 'Sigmoid', or 'Softmax2d'

catalyst.dl.callbacks.metrics.iou.ClasswiseJaccardCallback

alias of catalyst.dl.callbacks.metrics.iou.ClasswiseIouCallback
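
A sketch of the classwise variant, assuming one binary mask per class and log keys of the form prefix_class_name as described for the prefix parameter. The helpers iou_score and classwise_iou are hypothetical names used only for this illustration.

```python
from typing import Dict, List, Sequence


def iou_score(
    y_true: Sequence[int], y_pred: Sequence[int], eps: float = 1e-7
) -> float:
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    union = sum(y_true) + sum(y_pred) - intersection
    return intersection / (union + eps)


def classwise_iou(
    y_true: List[Sequence[int]],
    y_pred: List[Sequence[int]],
    classes: List[str],
    prefix: str = "iou",
) -> Dict[str, float]:
    """One IoU value per class, keyed as '<prefix>_<class name>'."""
    return {
        f"{prefix}_{name}": iou_score(y_true[i], y_pred[i])
        for i, name in enumerate(classes)
    }


classes = ["cat", "dog"]
y_true = [[1, 1, 0, 0], [0, 0, 1, 1]]  # one flat mask per class
y_pred = [[1, 0, 0, 0], [0, 1, 1, 1]]

metrics = classwise_iou(y_true, y_pred, classes)
```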

Global precision, recall and F1-score

class catalyst.dl.callbacks.metrics.ppv_tpr_f1.PrecisionRecallF1ScoreCallback(input_key: str = 'targets', output_key: str = 'logits', class_names: List[str] = None, num_classes: int = 2, threshold: float = 0.5, activation: str = 'Sigmoid')[source]

Bases: catalyst.dl.callbacks.meter.MeterMetricsCallback

Calculates the global precision (positive predictive value or ppv), recall (true positive rate or tpr), and F1-score per class for each loader.

Note

Currently supports binary and multi-label cases.

__init__(input_key: str = 'targets', output_key: str = 'logits', class_names: List[str] = None, num_classes: int = 2, threshold: float = 0.5, activation: str = 'Sigmoid')[source]
Parameters
  • input_key (str) – input key to use for metric calculation; specifies our y_true

  • output_key (str) – output key to use for metric calculation; specifies our y_pred

  • class_names (List[str]) – class names to display in the logs. If None, defaults to indices for each class, starting from 0.

  • num_classes (int) – Number of classes; must be > 1

  • threshold (float) – threshold for outputs binarization

  • activation (str) – A torch.nn activation applied to the outputs. Must be one of 'none', 'Sigmoid', or 'Softmax2d'
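
The multi-label case can be illustrated with a plain-Python sketch: probabilities are binarized per class with the threshold, true/false positives and false negatives are counted over all samples, and precision, recall and F1 are derived per class. The helper precision_recall_f1 is hypothetical, and the strict ">" comparison is an assumption.

```python
from typing import Dict, List, Sequence


def precision_recall_f1(
    y_true: Sequence[Sequence[int]],
    y_prob: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> List[Dict[str, float]]:
    """Global per-class precision, recall and F1 for multi-label targets."""
    num_classes = len(y_true[0])
    results = []
    for c in range(num_classes):
        tp = fp = fn = 0
        for true_row, prob_row in zip(y_true, y_prob):
            pred = 1 if prob_row[c] > threshold else 0
            if pred == 1 and true_row[c] == 1:
                tp += 1
            elif pred == 1:
                fp += 1
            elif true_row[c] == 1:
                fn += 1
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_prob = [[0.9, 0.2], [0.4, 0.7], [0.6, 0.8], [0.1, 0.6]]

per_class = precision_recall_f1(y_true, y_prob, threshold=0.5)
```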

Precision

class catalyst.dl.callbacks.metrics.precision.AveragePrecisionCallback(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'average_precision', multiplier: float = 1.0, class_args: List[str] = None, **kwargs)[source]

Bases: catalyst.core.callbacks.metrics.LoaderMetricCallback

AveragePrecision metric callback.

__init__(input_key: str = 'targets', output_key: str = 'logits', prefix: str = 'average_precision', multiplier: float = 1.0, class_args: List[str] = None, **kwargs)[source]
Parameters
  • input_key (str) – input key to use for mean average precision calculation; specifies our y_true.

  • output_key (str) – output key to use for mean average precision calculation; specifies our y_pred.

  • prefix (str) – metric’s name.

  • multiplier (float) – scale factor for the metric.

  • class_args (List[str]) – class names to display in the logs. If None, defaults to indices for each class, starting from 0
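
One common definition of average precision for a single class is the mean of precision@k taken at every rank k where a positive sample appears. The sketch below implements that definition in plain Python; the helper average_precision is hypothetical, and the callback's own computation may differ in details such as tie handling.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Mean of precision@k over the ranks k that hold a positive sample."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.1]

# Positives sit at ranks 1 and 3, so AP = (1/1 + 2/3) / 2.
ap = average_precision(y_true, y_score)
```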

Utils

Torch

catalyst.dl.utils.torch.get_loader(data_source: Iterable[dict], open_fn: Callable, dict_transform: Callable = None, sampler=None, collate_fn: Callable = <function default_collate>, batch_size: int = 32, num_workers: int = 4, shuffle: bool = False, drop_last: bool = False)[source]

Creates a DataLoader from given source and its open/transform params.

Parameters
  • data_source (Iterable[dict]) – an iterable containing your data annotations (for example paths to images, labels, bboxes, etc.)

  • open_fn (Callable) – function, that can open your annotations dict and transfer it to data, needed by your network (for example open image by path, or tokenize read string)

  • dict_transform (callable) – transforms to use on dict (for example normalize image, add blur, crop/resize/etc)

  • sampler (Sampler, optional) – defines the strategy to draw samples from the dataset

  • collate_fn (callable, optional) – merges a list of samples to form a mini-batch of Tensor(s). Used when using batched loading from a map-style dataset

  • batch_size (int, optional) – how many samples per batch to load

  • num_workers (int, optional) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process

  • shuffle (bool, optional) – set to True to have the data reshuffled at every epoch (default: False).

  • drop_last (bool, optional) – set to True to drop the last incomplete batch, if the dataset size is not divisible by the batch size. If False and the size of dataset is not divisible by the batch size, then the last batch will be smaller. (default: False)

Returns

DataLoader with catalyst.data.ListDataset
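
How open_fn, dict_transform, batch_size and drop_last interact can be sketched without any framework. The generator iterate_batches below is a hypothetical stand-in for the returned DataLoader: it skips collation, shuffling, samplers and worker processes, and only shows the per-item pipeline and the handling of the last incomplete batch.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional


def iterate_batches(
    data_source: Iterable[Dict],
    open_fn: Callable[[Dict], Dict],
    dict_transform: Optional[Callable[[Dict], Dict]] = None,
    batch_size: int = 32,
    drop_last: bool = False,
) -> Iterator[List[Dict]]:
    """Open each annotation, optionally transform it, and group items into batches."""
    batch: List[Dict] = []
    for annotation in data_source:
        item = open_fn(annotation)
        if dict_transform is not None:
            item = dict_transform(item)
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch and not drop_last:
        yield batch  # last, smaller batch


def open_text(row: Dict) -> Dict:
    # Turns an annotation into data the network needs (here: tokens and a target).
    return {"tokens": row["text"].split(), "targets": row["label"]}


def add_length(item: Dict) -> Dict:
    # A dict-level transform applied after opening.
    return {**item, "length": len(item["tokens"])}


annotations = [
    {"text": "a b c", "label": 0},
    {"text": "d e", "label": 1},
    {"text": "f", "label": 0},
]

batches = list(iterate_batches(annotations, open_text, add_length, batch_size=2))
full_only = list(
    iterate_batches(annotations, open_text, add_length, batch_size=2, drop_last=True)
)
```

With three items and batch_size=2, the default yields batches of sizes 2 and 1, while drop_last=True keeps only the full batch.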

Trace

catalyst.dl.utils.trace.trace_model(model: torch.nn.modules.module.Module, predict_fn: Callable, batch=None, method_name: str = 'forward', mode: str = 'eval', requires_grad: bool = False, opt_level: str = None, device: Union[str, torch.device] = 'cpu', predict_params: dict = None) → torch.jit.ScriptModule[source]

Traces model using runner and batch.

Parameters
  • model – Model to trace

  • predict_fn – Function that runs prediction with the provided model; takes the model and inputs as parameters

  • batch – Batch to trace the model

  • method_name (str) – Model’s method name that will be used as entrypoint during tracing

  • mode (str) – Mode for model to trace (train or eval)

  • requires_grad (bool) – Flag to use grads

  • opt_level (str) – Apex FP16 init level, optional

  • device (str) – Torch device

  • predict_params (dict) – additional parameters for model forward

Returns

Traced model

Return type

jit.ScriptModule

Raises

ValueError – if batch or predict_fn is not specified, or if mode is not ‘eval’ or ‘train’.
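
The argument contract described above can be sketched in plain Python. Both check_trace_arguments and the example predict_fn are hypothetical, and the model is replaced by an ordinary callable so that the snippet runs without PyTorch; it illustrates the expected shape of predict_fn and the two error conditions, not the tracing itself.

```python
from typing import Any, Callable, List, Optional


def check_trace_arguments(
    batch: Optional[Any], predict_fn: Optional[Callable], mode: str = "eval"
) -> None:
    """Mirror the documented ValueError conditions."""
    if batch is None or predict_fn is None:
        raise ValueError("both batch and predict_fn must be specified")
    if mode not in ("eval", "train"):
        raise ValueError(f"unknown mode {mode!r}; expected 'eval' or 'train'")


def predict_fn(model: Callable, inputs: Any, **kwargs: Any) -> Any:
    # Takes the model and the inputs; extra keyword arguments play the
    # role of predict_params.
    return model(inputs, **kwargs)


def toy_model(inputs: List[float], scale: float = 1.0) -> List[float]:
    # Stand-in for a torch.nn.Module; an assumption made for this sketch only.
    return [scale * value for value in inputs]


batch = [1.0, 2.0, 3.0]
check_trace_arguments(batch, predict_fn, mode="eval")
outputs = predict_fn(toy_model, batch, scale=2.0)
```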

catalyst.dl.utils.trace.trace_model_from_checkpoint(logdir: pathlib.Path, method_name: str, checkpoint_name: str, stage: str = None, loader: Union[str, int] = None, mode: str = 'eval', requires_grad: bool = False, opt_level: str = None, device: Union[str, torch.device] = 'cpu')[source]

Traces model using created experiment and runner.

Parameters
  • logdir (Union[str, Path]) – Path to Catalyst logdir with model

  • checkpoint_name (str) – Name of model checkpoint to use

  • stage (str) – experiment’s stage name

  • loader (Union[str, int]) – experiment’s loader name or its index

  • method_name (str) – Model’s method name that will be used as entrypoint during tracing

  • mode (str) – Mode for model to trace (train or eval)

  • requires_grad (bool) – Flag to use grads

  • opt_level (str) – AMP FP16 init level

  • device (str) – Torch device

Returns

the traced model

catalyst.dl.utils.trace.trace_model_from_runner(runner: catalyst.core.runner.IRunner, checkpoint_name: str = None, method_name: str = 'forward', mode: str = 'eval', requires_grad: bool = False, opt_level: str = None, device: Union[str, torch.device] = 'cpu') → torch.jit.ScriptModule[source]

Traces model using created experiment and runner.

Parameters
  • runner (Runner) – Current runner.

  • checkpoint_name (str) – Name of model checkpoint to use, if None traces current model from runner

  • method_name (str) – Model’s method name that will be used as entrypoint during tracing

  • mode (str) – Mode for model to trace (train or eval)

  • requires_grad (bool) – Flag to use grads

  • opt_level (str) – AMP FP16 init level

  • device (str) – Torch device

Returns

Traced model

Return type

ScriptModule

catalyst.dl.utils.trace.get_trace_name(method_name: str, mode: str = 'eval', requires_grad: bool = False, opt_level: str = None, additional_string: str = None) → str[source]

Creates a file name for the traced model.

Parameters
  • method_name (str) – model’s method name

  • mode (str) – train or eval

  • requires_grad (bool) – flag if model was traced with gradients

  • opt_level (str) – opt_level if model was traced in FP16

  • additional_string (str) – any additional information

Returns

Filename for traced model to be saved.

Return type

str

catalyst.dl.utils.trace.save_traced_model(model: torch.jit.ScriptModule, logdir: Union[str, pathlib.Path] = None, method_name: str = 'forward', mode: str = 'eval', requires_grad: bool = False, opt_level: str = None, out_dir: Union[str, pathlib.Path] = None, out_model: Union[str, pathlib.Path] = None, checkpoint_name: str = None) → None[source]

Saves traced model.

Parameters
  • model (ScriptModule) – Traced model

  • logdir (Union[str, Path]) – Path to experiment

  • method_name (str) – Name of the method that was traced

  • mode (str) – Model’s mode - train or eval

  • requires_grad (bool) – Whether model was traced with requires_grad or not

  • opt_level (str) – Apex FP16 init level used during tracing

  • out_dir (Union[str, Path]) – Directory to save model to (overrides logdir)

  • out_model (Union[str, Path]) – Path to save model to (overrides logdir & out_dir)

  • checkpoint_name (str) – Checkpoint name used to restore the model

Raises

ValueError – if nothing out of logdir, out_dir or out_model is specified.
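
The override order stated for logdir, out_dir and out_model can be sketched as a small resolver. The function pick_output_target is hypothetical and only reports which argument wins; it deliberately does not guess the subdirectory or file name the library builds from it.

```python
from pathlib import Path
from typing import Optional, Tuple, Union

PathLike = Union[str, Path]


def pick_output_target(
    logdir: Optional[PathLike] = None,
    out_dir: Optional[PathLike] = None,
    out_model: Optional[PathLike] = None,
) -> Tuple[str, Path]:
    """Return the name and value of the argument that decides the save location."""
    if out_model is not None:
        return "out_model", Path(out_model)  # overrides logdir and out_dir
    if out_dir is not None:
        return "out_dir", Path(out_dir)  # overrides logdir
    if logdir is not None:
        return "logdir", Path(logdir)
    raise ValueError("one of logdir, out_dir or out_model must be specified")


winner, location = pick_output_target(logdir="logs/run1", out_dir="export")
explicit, explicit_path = pick_output_target(
    logdir="logs/run1", out_dir="export", out_model="export/model.pth"
)
```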

catalyst.dl.utils.trace.load_traced_model(model_path: Union[str, pathlib.Path], device: Union[str, torch.device] = 'cpu', opt_level: str = None) → torch.jit.ScriptModule[source]

Loads a traced model.

Parameters
  • model_path – Path to traced model

  • device (str) – Torch device

  • opt_level (str) – Apex FP16 init level, optional

Returns

Traced model

Return type

ScriptModule

Wizard

catalyst.dl.utils.wizard.run_wizard()[source]

Method to initialize and run wizard.

class catalyst.dl.utils.wizard.Wizard[source]

Bases: object

Class for Catalyst Config API Wizard.

The instance of this class will be created and called from cli command: catalyst-dl init --interactive.

With the help of this Wizard, the user can set up a pipeline from available templates and choose which predefined classes to use in different parts of the pipeline.

__init__()[source]

Initializing an instance of this class prints a welcome message and the Catalyst logo in ASCII format. It also saves all classes of Catalyst’s own pipeline parts, so that the user’s modules can be put at the top of the lists to ease the choice.

run()[source]

Walks user through predefined wizard steps.