Core

Experiment

class catalyst.core.experiment.IExperiment[source]

Bases: abc.ABC

An abstraction that contains information about the experiment – a model, a criterion, an optimizer, a scheduler, and their hyperparameters. It also contains information about the data and transformations used. In general, the Experiment knows what you would like to run.

Note

To learn more about Catalyst Core concepts, please check out the Runner and Callback sections below.

This is an abstraction; please check out its implementations for more details.

abstract property distributed_params

Dictionary with the parameters for distributed and half-precision training.

Used in catalyst.utils.distributed.process_components to setup Nvidia Apex or PyTorch distributed.

Example:

>>> experiment.distributed_params
{"opt_level": "O1", "syncbn": True}  # Apex variant
abstract get_callbacks(stage: str) → OrderedDict[str, Callback][source]

Returns callbacks for a given stage.

Note

To learn more about Catalyst Callbacks mechanism, please follow catalyst.core.callback.Callback documentation.

Note

We need an ordered dictionary to guarantee the correct dataflow and order of metrics optimization: for example, to compute the loss before the optimization step, or to compute all the metrics before logging :)

Parameters

stage (str) – stage name of interest, like “pretrain” / “train” / “finetune” / etc

Returns

Ordered dictionary with callbacks for the current stage.

Return type

OrderedDict[str, Callback]
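
Example (a hedged sketch of a possible result; the callback names and constructor arguments are assumptions, using callbacks documented in the Callbacks section below):

>>> experiment.get_callbacks(stage="training")
OrderedDict([
    ("criterion", CriterionCallback()),
    ("optimizer", OptimizerCallback()),
    ("checkpoint", CheckpointCallback(save_n_best=3)),
])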

abstract get_criterion(stage: str) → torch.nn.modules.module.Module[source]

Returns the criterion for a given stage.

Example:

# for typical classification task
>>> experiment.get_criterion(stage="training")
nn.CrossEntropyLoss()
Parameters

stage (str) – stage name of interest like “pretrain” / “train” / “finetune” / etc

Returns

Criterion: criterion for a given stage.

get_datasets(stage: str, epoch: int = None, **kwargs) → OrderedDict[str, Dataset][source]

Returns the datasets for a given stage and epoch.

Note

For Deep Learning cases you have the same dataset during the whole stage.

For Reinforcement Learning it is common to change the dataset (experiment) every training epoch.

Parameters
  • stage (str) – stage name of interest, like “pretrain” / “train” / “finetune” / etc

  • epoch (int) – epoch index

  • **kwargs (dict) – additional parameters to use during dataset creation

Returns

OrderedDict[str, Dataset]: Ordered dictionary with datasets for current stage and epoch.

Note

We need an ordered dictionary to guarantee the correct dataflow and order of our training datasets: for example, to run through the train data before the validation one :)

Example:

>>> experiment.get_datasets(
>>>     stage="training",
>>>     in_csv_train="path/to/train/csv",
>>>     in_csv_valid="path/to/valid/csv",
>>> )
OrderedDict({
    "train": CsvDataset(in_csv=in_csv_train, ...),
    "valid": CsvDataset(in_csv=in_csv_valid, ...),
})
get_experiment_components(stage: str, model: torch.nn.modules.module.Module = None) → Tuple[torch.nn.modules.module.Module, torch.nn.modules.module.Module, torch.optim.optimizer.Optimizer, torch.optim.lr_scheduler._LRScheduler][source]

Returns a tuple containing the model, criterion, optimizer and scheduler for a given stage and model.

Aggregation method, based on get_model, get_criterion, get_optimizer and get_scheduler.

Parameters
  • stage (str) – stage name of interest, like “pretrain” / “train” / “finetune” / etc

  • model (Model) – model to optimize with stage optimizer

Returns

model, criterion, optimizer, scheduler

for a given stage and model

Return type

tuple
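
Example (a usage sketch; the stage name is illustrative):

>>> model, criterion, optimizer, scheduler = experiment.get_experiment_components(
>>>     stage="training", model=model
>>> )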

abstract get_loaders(stage: str, epoch: int = None) → OrderedDict[str, DataLoader][source]

Returns the loaders for a given stage.

Note

Wrapper for catalyst.core.experiment.IExperiment.get_datasets. For most of your experiments you only need to override the get_datasets method.

Parameters
  • stage (str) – stage name of interest, like “pretrain” / “train” / “finetune” / etc

  • epoch (int) – epoch index

Returns

OrderedDict[str, DataLoader]: Ordered dictionary with loaders for current stage and epoch.
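
Example (a hedged sketch of a possible result; the dataset variables and DataLoader arguments are illustrative):

>>> experiment.get_loaders(stage="training")
OrderedDict({
    "train": DataLoader(train_dataset, batch_size=32, shuffle=True),
    "valid": DataLoader(valid_dataset, batch_size=32),
})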

abstract get_model(stage: str) → torch.nn.modules.module.Module[source]

Returns the model for a given stage.

Example:

# suppose we have typical MNIST model, like
# nn.Sequential(nn.Linear(28*28, 128), nn.Linear(128, 10))
>>> experiment.get_model(stage="training")
Sequential(
  (0): Linear(in_features=784, out_features=128, bias=True)
  (1): Linear(in_features=128, out_features=10, bias=True)
)
Parameters

stage (str) – stage name of interest like “pretrain” / “train” / “finetune” / etc

Returns

Model: model for a given stage.

abstract get_optimizer(stage: str, model: torch.nn.modules.module.Module) → torch.optim.optimizer.Optimizer[source]

Returns the optimizer for a given stage and model.

Example:

>>> experiment.get_optimizer(stage="training", model=model)
torch.optim.Adam(model.parameters())
Parameters
  • stage (str) – stage name of interest like “pretrain” / “train” / “finetune” / etc

  • model (Model) – model to optimize with stage optimizer

Returns

Optimizer: optimizer for a given stage and model.

abstract get_scheduler(stage: str, optimizer: torch.optim.optimizer.Optimizer) → torch.optim.lr_scheduler._LRScheduler[source]

Returns the scheduler for a given stage and optimizer.

Example:
>>> experiment.get_scheduler(stage="training", optimizer=optimizer)
torch.optim.lr_scheduler.StepLR(optimizer)
Parameters
  • stage (str) – stage name of interest like “pretrain” / “train” / “finetune” / etc

  • optimizer (Optimizer) – optimizer to schedule with stage scheduler

Returns

Scheduler: scheduler for a given stage and optimizer.

abstract get_stage_params(stage: str) → Mapping[str, Any][source]

Returns extra stage parameters for a given stage.

Example:

>>> experiment.get_stage_params(stage="training")
{
    "logdir": "./logs/training",
    "num_epochs": 42,
    "valid_loader": "valid",
    "main_metric": "loss",
    "minimize_metric": True,
    "checkpoint_data": {
        "comment": "break the cycle - use the Catalyst"
    }
}
Parameters

stage (str) – stage name of interest like “pretrain” / “train” / “finetune” / etc

Returns

dict: parameters for a given stage.

get_transforms(stage: str = None, dataset: str = None)[source]

Returns the data transforms for a given stage and dataset.

Parameters
  • stage (str) – stage name of interest, like “pretrain” / “train” / “finetune” / etc

  • dataset (str) – dataset name of interest, like “train” / “valid” / “infer”

Note

For datasets/loaders naming, please follow the catalyst.core.runner documentation.

Returns

Data transformations to use for specified dataset.
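
Example (a hedged sketch, assuming torchvision-style transforms; the actual transform type depends on your experiment):

>>> experiment.get_transforms(stage="training", dataset="train")
Compose(
    ToTensor()
    Normalize(mean=(0.1307,), std=(0.3081,))
)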

abstract property hparams

Returns the experiment hyperparameters.

Example:
>>> experiment.hparams
OrderedDict([('optimizer', 'Adam'),
 ('lr', 0.02),
 ('betas', (0.9, 0.999)),
 ('eps', 1e-08),
 ('weight_decay', 0),
 ('amsgrad', False),
 ('train_batch_size', 32)])
abstract property initial_seed

Experiment’s initial seed, used to set up the global seed at the beginning of each stage. Additionally, the Catalyst Runner sets experiment.initial_seed + runner.global_epoch + 1 as the global seed each epoch. Used for experiment reproducibility.

Example:

>>> experiment.initial_seed
42
abstract property logdir

Path to the directory where the experiment logs will be saved.

Example:

>>> experiment.logdir
./path/to/my/experiment/logs
abstract property stages

Experiment’s stage names.

Example:

>>> experiment.stages
["pretraining", "training", "finetuning"]

Note

To understand the stages concept, please follow the Catalyst documentation, for example catalyst.core.callback.Callback.

Runner

class catalyst.core.runner.IRunner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, **kwargs)[source]

Bases: abc.ABC, catalyst.core.legacy.IRunnerLegacy, catalyst.tools.frozen_class.FrozenClass

An abstraction that knows how to run an experiment. It contains all the logic of how to run the experiment: stages, epochs and batches.

Note

To learn more about Catalyst Core concepts, please check out the Experiment and Callback sections.

This is an abstraction; please check out its implementations for more details.

The Runner also contains full information about the experiment run.

Runner section

runner.model - an instance of the torch.nn.Module class (should implement the forward method); for example,

runner.model = torch.nn.Linear(10, 10)

runner.device - an instance of torch.device (CPU, GPU, TPU); for example,

runner.device = torch.device("cpu")

Experiment section

runner.criterion - an instance of torch.nn.Module class or torch.nn.modules.loss._Loss (should implement forward method); for example,

runner.criterion = torch.nn.CrossEntropyLoss()

runner.optimizer - an instance of torch.optim.optimizer.Optimizer (should implement step method); for example,

runner.optimizer = torch.optim.Adam(runner.model.parameters())

runner.scheduler - an instance of torch.optim.lr_scheduler._LRScheduler (should implement step method); for example,

runner.scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(runner.optimizer)

runner.callbacks - ordered dictionary with Catalyst.Callback instances; for example,

runner.callbacks = {
    "accuracy": AccuracyCallback(),
    "criterion": CriterionCallback(),
    "optim": OptimizerCallback(),
    "saver": CheckpointCallback()
}

Dataflow section

runner.loaders - ordered dictionary with torch.DataLoaders; for example,

runner.loaders = {
    "train": MnistTrainLoader(),
    "valid": MnistValidLoader()
}

Note

  • “train” prefix is used for training loaders - metrics computations, backward pass, optimization

  • “valid” prefix is used for validation loaders - metrics computations only

  • “infer” prefix is used for inference loaders - dataset prediction

runner.input - dictionary containing the batch of data from the current DataLoader; for example,

runner.input = {
    "images": np.ndarray(batch_size, c, h, w),
    "targets": np.ndarray(batch_size, 1),
}

runner.output - dictionary containing the model output for the current batch; for example,

runner.output = {"logits": torch.Tensor(batch_size, num_classes)}

Metrics section

runner.batch_metrics - dictionary, flat storage for batch metrics; for example,

runner.batch_metrics = {"loss": ..., "accuracy": ..., "iou": ...}

runner.loader_metrics - dictionary with aggregated batch statistics for loader (mean over all batches) and global loader metrics, like AUC; for example,

runner.loader_metrics = {"loss": ..., "accuracy": ..., "auc": ...}

runner.epoch_metrics - dictionary with summarized metrics for different loaders and global epoch metrics, like lr, momentum; for example,

runner.epoch_metrics = {
    "train_loss": ..., "train_auc": ..., "valid_loss": ...,
    "lr": ..., "momentum": ...,
}

Validation metrics section

runner.main_metric - string, containing name of metric of interest for optimization, validation and checkpointing during training

runner.minimize_metric - bool, indicator flag

  • True if we need to minimize metric during training, like Cross Entropy loss

  • False if we need to maximize metric during training, like Accuracy or Intersection over Union

Validation section

runner.valid_loader - string, name of the validation loader used for metric selection, validation and model checkpointing

runner.valid_metrics - dictionary with validation metrics for the current epoch; for example,

runner.valid_metrics = {"loss": ..., "accuracy": ..., "auc": ...}

Note

subdictionary of epoch_metrics

runner.is_best_valid - bool, indicator flag

  • True if this training epoch is best over all epochs

  • False if not

runner.best_valid_metrics - dictionary with best validation metrics during whole training process

Distributed section

runner.distributed_rank - distributed rank of current worker

runner.is_distributed_master - bool, indicator flag

  • True if is master node (runner.distributed_rank == 0)

  • False if is worker node (runner.distributed_rank != 0)

runner.is_distributed_worker - bool, indicator flag

  • True if is worker node (runner.distributed_rank > 0)

  • False if is master node (runner.distributed_rank <= 0)

Experiment info section

runner.global_sample_step - int, numerical indicator, counter for all individual samples that pass through our model during the training, validation and inference stages

runner.global_batch_step - int, numerical indicator, counter for all batches that pass through our model during the training, validation and inference stages

runner.global_epoch - int, numerical indicator, counter for all epochs that have passed during model training, validation and inference stages

runner.verbose - bool, indicator flag

runner.is_check_run - bool, indicator flag

  • True if you want to check your pipeline and run only 2 batches per loader and 2 epochs per stage

  • False (default) if you want to run the pipeline as usual

runner.need_early_stop - bool, indicator flag used for EarlyStopping and CheckRun Callbacks

  • True if we need to stop the training

  • False (default) otherwise

runner.need_exception_reraise - bool, indicator flag

  • True (default) if you want exceptions raised during the pipeline to be re-raised and the training process to be stopped

  • False otherwise

Stage info section

runner.stage_name - string, current stage name, for example,

runner.stage_name = "pretraining" / "training" / "finetuning" / etc

runner.num_epochs - int, maximum number of epochs, required for this stage

runner.is_infer_stage - bool, indicator flag

  • True for inference stages

  • False otherwise

Epoch info section

runner.epoch - int, numerical indicator for current stage epoch

Loader info section

runner.loader_sample_step - int, numerical indicator for number of samples passed through our model in current loader

runner.loader_batch_step - int, numerical indicator for batch index in current loader

runner.loader_name - string, current loader name; for example,

runner.loader_name = "train_dataset1" / "valid_data2" / "infer_golden"

runner.loader_len - int, maximum number of batches in current loader

runner.loader_batch_size - int, batch size parameter in current loader

runner.is_train_loader - bool, indicator flag

  • True for training loaders

  • False otherwise

runner.is_valid_loader - bool, indicator flag

  • True for validation loaders

  • False otherwise

runner.is_infer_loader - bool, indicator flag

  • True for inference loaders

  • False otherwise

Batch info section

runner.batch_size - int, length of the current batch

Logging section

runner.logdir - string, path to logging directory to save all logs, metrics, checkpoints and artifacts

runner.checkpoint_data - dictionary with all extra data for experiment tracking

Extra section

runner.exception - python Exception instance to raise (or not ;) )

__init__(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, **kwargs)[source]
Parameters
  • model (RunnerModel) – Torch model object

  • device (Device) – Torch device

property device

Returns the runner’s device instance.

get_attr(key: str, inner_key: str = None) → Any[source]

Alias for python getattr method. Useful for Callbacks preparation and cases with multi-criterion, multi-optimizer setup. For example, when you would like to train multi-task classification.

Used to get a named attribute from an IRunner by the key keyword; for example:

# example 1
runner.get_attr("criterion")
# is equivalent to
runner.criterion

# example 2
runner.get_attr("optimizer")
# is equivalent to
runner.optimizer

# example 3
runner.get_attr("scheduler")
# is equivalent to
runner.scheduler

When inner_key is used, the attribute under key is expected to be a dictionary, and inner_key is taken from this dict; for example,

# example 1
runner.get_attr("criterion", "bce")
# is equivalent to
runner.criterion["bce"]

# example 2
runner.get_attr("optimizer", "adam")
# is equivalent to
runner.optimizer["adam"]

# example 3
runner.get_attr("scheduler", "adam")
# is equivalent to
runner.scheduler["adam"]
Parameters
  • key (str) – name for attribute of interest, like criterion, optimizer, scheduler

  • inner_key (str) – name of inner dictionary key

Returns

inner attribute

property model

Returns the runner’s model instance.

run_experiment(experiment: catalyst.core.experiment.IExperiment = None) → catalyst.core.runner.IRunner[source]

Starts the experiment.

Parameters

experiment (IExperiment) – Experiment instance to use for Runner.

Returns

self, IRunner instance after the experiment

Raises
  • Exception – if an exception was raised during the pipeline and no handler for it was found in the callbacks

  • KeyboardInterrupt – if a KeyboardInterrupt was raised during the pipeline and no handler for it was found in the callbacks
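
Example (a minimal usage sketch, assuming experiment and runner are concrete IExperiment and IRunner implementations):

>>> runner.run_experiment(experiment)
>>> runner.epoch_metrics  # metrics collected during the run
{"train_loss": ..., "valid_loss": ..., "lr": ...}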

class catalyst.core.runner.IStageBasedRunner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, device: Union[str, torch.device] = None, **kwargs)[source]

Bases: catalyst.core.runner.IRunner

Runner abstraction that is supposed to have constant data sources per stage.

exception catalyst.core.runner.RunnerException(message: str)[source]

Bases: Exception

Exception class for all runner errors.

__init__(message: str)[source]
Parameters

message – exception message

Callback

class catalyst.core.callback.Callback(order: int, node: int = <CallbackNode.All: 0>, scope: int = <CallbackScope.Stage: 0>)[source]

Bases: object

An abstraction that lets you customize your experiment run logic. To give users maximum flexibility and extensibility, Catalyst supports callback execution anywhere in the training loop:

-- stage start
---- epoch start
------ loader start
-------- batch start
---------- batch handler (Runner logic)
-------- batch end
------ loader end
---- epoch end
-- stage end

exception – if an Exception was raised

All callbacks have:
  • order from CallbackOrder

  • node from CallbackNode

  • scope from CallbackScope

Note

To learn more about Catalyst Core concepts, please check out the Experiment and Runner sections.

This is an abstraction; please check out its implementations for more details.
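
Example (a minimal sketch of a custom callback, assuming the standard catalyst imports; the "loss" key in runner.batch_metrics is an assumption that holds when a CriterionCallback with the default prefix is used):

class LossPrintCallback(Callback):
    def __init__(self):
        # Logging order, so all batch metrics are already computed at this point
        super().__init__(order=CallbackOrder.Logging, node=CallbackNode.All)

    def on_batch_end(self, runner: IRunner):
        # print the current batch loss (assumes a "loss" key exists)
        print(runner.batch_metrics["loss"])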

__init__(order: int, node: int = <CallbackNode.All: 0>, scope: int = <CallbackScope.Stage: 0>)[source]

Callback initializer.

Parameters
  • order – flag from CallbackOrder

  • node – flag from CallbackNode

  • scope – flag from CallbackScope

on_batch_end(runner: IRunner)[source]

Event handler for batch end.

Parameters

runner ("IRunner") – IRunner instance.

on_batch_start(runner: IRunner)[source]

Event handler for batch start.

Parameters

runner ("IRunner") – IRunner instance.

on_epoch_end(runner: IRunner)[source]

Event handler for epoch end.

Parameters

runner ("IRunner") – IRunner instance.

on_epoch_start(runner: IRunner)[source]

Event handler for epoch start.

Parameters

runner ("IRunner") – IRunner instance.

on_exception(runner: IRunner)[source]

Event handler for exception case.

Parameters

runner ("IRunner") – IRunner instance.

on_loader_end(runner: IRunner)[source]

Event handler for loader end.

Parameters

runner ("IRunner") – IRunner instance.

on_loader_start(runner: IRunner)[source]

Event handler for loader start.

Parameters

runner ("IRunner") – IRunner instance.

on_stage_end(runner: IRunner)[source]

Event handler for stage end.

Parameters

runner ("IRunner") – IRunner instance.

on_stage_start(runner: IRunner)[source]

Event handler for stage start.

Parameters

runner ("IRunner") – IRunner instance.

class catalyst.core.callback.CallbackNode[source]

Bases: enum.IntFlag

Callback node usage flag during distributed training.

  • All (0) - use on all nodes, both master and worker.

  • Master (1) - use only on the master node.

  • Worker (2) - use only on worker nodes.

All = 0
Master = 1
Worker = 2
all = 0
master = 1
worker = 2
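
Example (a hedged sketch of a custom callback restricted to the master node during distributed training; the class name is illustrative):

class MasterOnlyLoggingCallback(Callback):
    def __init__(self):
        # executed only where runner.distributed_rank == 0
        super().__init__(order=CallbackOrder.Logging, node=CallbackNode.Master)
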
class catalyst.core.callback.CallbackOrder[source]

Bases: enum.IntFlag

Callback usage order during training.

Catalyst executes Callbacks with low CallbackOrder before Callbacks with high CallbackOrder.

Predefined orders:

  • Internal (0) - some Catalyst Extras, like PhaseCallbacks (used in GANs).

  • Metric (20) - Callbacks with metrics and losses computation.

  • MetricAggregation (40) - metrics aggregation callbacks, like summing different losses into one.

  • Optimizer (60) - optimizer step, requires computed metrics for optimization.

  • Validation (80) - validation step, computes validation metrics subset based on all metrics.

  • Scheduler (100) - scheduler step, in ReduceLROnPlateau case requires computed validation metrics for optimizer schedule.

  • Logging (120) - logging step, logs metrics to Console/Tensorboard/Alchemy, requires computed metrics.

  • External (200) - additional callbacks with custom logic, like InferenceCallbacks

Nevertheless, you can always create a custom Callback with any order, for example:

>>> class MyCustomCallback(Callback):
>>>     def __init__(self):
>>>         super().__init__(order=42)
>>>     ...
# MyCustomCallback will be executed after all `Metric`-Callbacks
# but before all `MetricAggregation`-Callbacks.
External = 200
Internal = 0
Logging = 120
Metric = 20
MetricAggregation = 40
Optimizer = 60
Scheduler = 100
Validation = 80
external = 200
internal = 0
logging = 120
metric = 20
metric_aggregation = 40
optimizer = 60
scheduler = 100
validation = 80
class catalyst.core.callback.CallbackScope[source]

Bases: enum.IntFlag

Callback scope usage flag during training.

  • Stage (0) - use Callback only during one experiment stage.

  • Experiment (1) - use Callback during whole experiment run.

Experiment = 1
Stage = 0
experiment = 1
stage = 0
class catalyst.core.callback.WrapperCallback(base_callback: catalyst.core.callback.Callback, enable_callback: bool = True)[source]

Bases: catalyst.core.callback.Callback

Enable/disable callback execution.

__init__(base_callback: catalyst.core.callback.Callback, enable_callback: bool = True)[source]
Parameters
  • base_callback (Callback) – callback to wrap

  • enable_callback (boolean) – indicator to enable/disable callback, if True then callback will be enabled, default True
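
Example (a brief usage sketch; the wrapped callback is illustrative):

# wrap a callback so it can be switched off without removing it from the callbacks list
callback = WrapperCallback(
    base_callback=CriterionCallback(),
    enable_callback=False,  # the wrapped callback will be skipped
)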

on_batch_end(runner: IRunner) → None[source]

Run base_callback (if possible)

Parameters

runner (IRunner) – current runner

on_batch_start(runner: IRunner) → None[source]

Run base_callback (if possible)

Parameters

runner (IRunner) – current runner

on_epoch_end(runner: IRunner) → None[source]

Run base_callback (if possible)

Parameters

runner (IRunner) – current runner

on_epoch_start(runner: IRunner) → None[source]

Run base_callback (if possible)

Parameters

runner (IRunner) – current runner

on_exception(runner: IRunner) → None[source]

Run base_callback (if possible)

Parameters

runner (IRunner) – current runner

on_loader_end(runner: IRunner) → None[source]

Reset status of callback

Parameters

runner (IRunner) – current runner

on_loader_start(runner: IRunner) → None[source]

Check if current epoch should be skipped.

Parameters

runner (IRunner) – current runner

on_stage_end(runner: IRunner) → None[source]

Run base_callback (if possible)

Parameters

runner (IRunner) – current runner

on_stage_start(runner: IRunner) → None[source]

Run base_callback (if possible)

Parameters

runner (IRunner) – current runner

Callbacks

BatchOverfitCallback

class catalyst.core.callbacks.batch_overfit.BatchOverfitCallback(**kwargs)[source]

Bases: catalyst.core.callback.Callback

Callback for overfitting loaders with a specified number of batches. By default, 1 batch is used per loader.

For example, if you have train, train_additional, valid and valid_additional loaders and want to overfit train on the first 1 batch, train_additional on the first 2 batches, valid on the first 20% of batches and valid_additional on 50% of batches:

from catalyst.dl import (
    SupervisedRunner, BatchOverfitCallback,
)
runner = SupervisedRunner()
runner.train(
    ...
    loaders={
        "train": ...,
        "train_additional": ...,
        "valid": ...,
        "valid_additional":...
    }
    ...
    callbacks=[
        ...
        BatchOverfitCallback(
            train_additional=2,
            valid=0.2,
            valid_additional=0.5
        ),
        ...
    ]
    ...
)

Minimal working example

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

# data
num_samples, num_features = int(1e4), int(1e1)
X, y = torch.rand(num_samples, num_features), torch.rand(num_samples)
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [3, 6])

# model training
runner = dl.SupervisedRunner()
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=8,
    verbose=True,
    callbacks=[dl.BatchOverfitCallback(train=10, valid=0.5)]
)
__init__(**kwargs)[source]
Parameters

kwargs – loader names and their number of batches to overfit.

on_epoch_end(runner: catalyst.core.runner.IRunner)[source]

Unwraps loaders for current epoch.

Parameters

runner (IRunner) – current runner

on_epoch_start(runner: catalyst.core.runner.IRunner) → None[source]

Wraps loaders for the current epoch. If the number of batches for a loader is not provided, then the first batch from that loader will be used for overfitting.

Parameters

runner (IRunner) – current runner

Checkpoint

class catalyst.core.callbacks.checkpoint.CheckpointCallback(save_n_best: int = 1, resume: str = None, resume_dir: str = None, metrics_filename: str = '_metrics.json', load_on_stage_start: Union[str, Dict[str, str]] = None, load_on_stage_end: Union[str, Dict[str, str]] = None)[source]

Bases: catalyst.core.callbacks.checkpoint.BaseCheckpointCallback

Checkpoint callback to save/restore your model/criterion/optimizer/scheduler.

__init__(save_n_best: int = 1, resume: str = None, resume_dir: str = None, metrics_filename: str = '_metrics.json', load_on_stage_start: Union[str, Dict[str, str]] = None, load_on_stage_end: Union[str, Dict[str, str]] = None)[source]
Parameters
  • save_n_best (int) – number of best checkpoints to keep; if 0, then only the last state of the model is stored and load_on_stage_end should be one of last or last_full.

  • resume (str) – path to checkpoint to load and initialize runner state

  • resume_dir (str) – directory with checkpoints; if specified in combination with resume, then the resume checkpoint will be loaded from resume_dir

  • metrics_filename (str) – filename to save metrics in the checkpoint folder. Must end with .json or .yml

  • load_on_stage_start (str or Dict[str, str]) –

    load specified state/model at stage start.

    If a string is passed, then initialization will be performed from the specified state (best/best_full/last/last_full) or checkpoint file.

    If a dict is passed, then initialization will be performed only for the specified parts - model, criterion, optimizer, scheduler.

    Example

    >>> # possible checkpoints to use:
    >>> #   "best"/"best_full"/"last"/"last_full"
    >>> #   or path to specific checkpoint
    >>> to_load = {
    >>>    "model": "path/to/checkpoint.pth",
    >>>    "criterion": "best",
    >>>    "optimizer": "last_full",
    >>>    "scheduler": "best_full",
    >>> }
    >>> CheckpointCallback(load_on_stage_start=to_load)
    

    All keys other than "model", "criterion", "optimizer" and "scheduler" will be ignored.

    If None or an empty dict (or a dict without the keys mentioned above), then no action is performed at stage start and:

    • Config API - the best state of the model will be used

    • Notebook API - no action will be performed (the last state will be used)

    NOTE: Loading will be performed on all stages except first.

    NOTE: Criterion, optimizer and scheduler are optional keys and should be loaded from full checkpoint.

    Model state can be loaded from any checkpoint.

    When the dict contains keys for the model and some other part (for example {"model": "last", "optimizer": "last"}) and they match in prefix ("best" and "best_full"), then the full checkpoint will be loaded, because it contains the required states.

  • load_on_stage_end (str or Dict[str, str]) –

    load specified state/model at stage end.

    If a string is passed, then initialization will be performed from the specified state (best/best_full/last/last_full) or checkpoint file.

    If a dict is passed, then initialization will be performed only for the specified parts - model, criterion, optimizer, scheduler. The logic for the dict is the same as for load_on_stage_start.

    If None, then no action is performed at stage end and the last runner state is used.

    NOTE: Loading will always be performed at stage end.
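
Example (a brief usage sketch for the Notebook API; argument values are illustrative):

runner.train(
    ...
    callbacks=[
        ...
        CheckpointCallback(save_n_best=3, load_on_stage_end="best"),
        ...
    ]
)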

on_epoch_end(runner: catalyst.core.runner.IRunner) → None[source]

Collect and save checkpoint after epoch.

Parameters

runner (IRunner) – current runner

on_stage_end(runner: catalyst.core.runner.IRunner) → None[source]

Show information about best checkpoints during the stage and load model specified in load_on_stage_end.

Parameters

runner (IRunner) – current runner

on_stage_start(runner: catalyst.core.runner.IRunner) → None[source]

Setup model for stage.

Note

If CheckpointCallback was initialized with resume (as a path to a checkpoint file), or with resume (as a filename) and resume_dir (as the directory containing that file), then the checkpoint will be loaded.

Parameters

runner (IRunner) – current runner

process_checkpoint(logdir: Union[str, pathlib.Path], checkpoint: Dict, is_best: bool, main_metric: str = 'loss', minimize_metric: bool = True) → None[source]

Save checkpoint and metrics.

Parameters
  • logdir (str or Path object) – directory for storing checkpoints

  • checkpoint (dict) – dict with checkpoint data

  • is_best (bool) – indicator to save the best checkpoint; if true, then two additional checkpoints will be saved - best and best_full.

  • main_metric (str) – metric to use for selecting the best model

  • minimize_metric (bool) – indicator for selecting the best metric; if true, then the best metric is the one with the lowest value, otherwise the one with the greatest value.

process_metrics(last_valid_metrics: Dict[str, float]) → Dict[source]

Add last validation metrics to list of previous validation metrics and keep save_n_best metrics.

Parameters

last_valid_metrics (dict) – dict with metrics from last validation step.

Returns

processed metrics

Return type

OrderedDict

truncate_checkpoints(minimize_metric: bool) → None[source]

Keep save_n_best checkpoints based on main metric.

Parameters

minimize_metric (bool) – if True, then keep the save_n_best checkpoints with the lowest values of the main metric, otherwise with the highest.

class catalyst.core.callbacks.checkpoint.IterationCheckpointCallback(save_n_last: int = 1, period: int = 100, stage_restart: bool = True, metrics_filename: str = '_metrics_iter.json', load_on_stage_end: str = 'best_full')[source]

Bases: catalyst.core.callbacks.checkpoint.BaseCheckpointCallback

Iteration checkpoint callback to save your model/criterion/optimizer.

__init__(save_n_last: int = 1, period: int = 100, stage_restart: bool = True, metrics_filename: str = '_metrics_iter.json', load_on_stage_end: str = 'best_full')[source]
Parameters
  • save_n_last (int) – number of last checkpoints to keep

  • period (int) – save a checkpoint every period batches

  • stage_restart (bool) – restart counter every stage or not

  • metrics_filename (str) – filename to save metrics in the checkpoint folder. Must end with .json or .yml

  • load_on_stage_end (str) – name of the model to load at the end of the stage. You can use best or best_full (default) to load the best model according to validation metrics, or last / last_full to use just the last one.
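
Example (a brief usage sketch; values are illustrative):

# save a checkpoint every 500 iterations and keep the 2 most recent ones
callback = IterationCheckpointCallback(save_n_last=2, period=500)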

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

Save checkpoint based on batches count.

Parameters

runner (IRunner) – current runner

on_stage_end(runner: catalyst.core.runner.IRunner)[source]

Load model specified in load_on_stage_end.

Parameters

runner (IRunner) – current runner

on_stage_start(runner: catalyst.core.runner.IRunner)[source]

Reset iterations counter.

Parameters

runner (IRunner) – current runner

process_checkpoint(logdir: Union[str, pathlib.Path], checkpoint: Dict, batch_metrics: Dict[str, float])[source]

Save checkpoint and metrics.

Parameters
  • logdir (str or Path object) – directory for storing checkpoints

  • checkpoint (dict) – dict with checkpoint data

  • batch_metrics (dict) – dict with metrics based on a few batches

process_metrics() → Dict[source]

Update metrics with last save_n_last checkpoints.

Returns

updated metrics

truncate_checkpoints(**kwargs) → None[source]

Keep save_n_best checkpoints based on main metric.

Parameters

**kwargs – extra params

class catalyst.core.callbacks.checkpoint.ICheckpointCallback(order: int, node: int = <CallbackNode.All: 0>, scope: int = <CallbackScope.Stage: 0>)[source]

Bases: catalyst.core.callback.Callback

Checkpoint callback interface, abstraction over model checkpointing step.

class catalyst.core.callbacks.checkpoint.BaseCheckpointCallback(metrics_filename: str = '_metrics.json')[source]

Bases: catalyst.core.callbacks.checkpoint.ICheckpointCallback

Base class for all checkpoint callbacks.

__init__(metrics_filename: str = '_metrics.json')[source]
Parameters

metrics_filename (str) – filename to save metrics in checkpoint folder. Must ends on .json or .yml

on_exception(runner: catalyst.core.runner.IRunner)[source]

Exception handler.

Parameters

runner – current runner

Control Flow

class catalyst.core.callbacks.control_flow.ControlFlowCallback(base_callback: catalyst.core.callback.Callback, epochs: Union[int, Sequence[int]] = None, ignore_epochs: Union[int, Sequence[int]] = None, loaders: Union[str, Sequence[str], Mapping[str, Union[int, Sequence[int]]]] = None, ignore_loaders: Union[str, Sequence[str], Mapping[str, Union[int, Sequence[int]]]] = None, filter_fn: Union[str, Callable[[str, int, str], bool]] = None, use_global_epochs: bool = False)[source]

Bases: catalyst.core.callback.WrapperCallback

Enable/disable callback execution on different stages, loaders and epochs.

Note

Please run experiment with check option to check if everything works as expected with this callback.

For example, if you don’t want to compute the loss on validation, you can ignore CriterionCallback; for the Notebook API you need to wrap the callback:

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst.dl import (
    SupervisedRunner, AccuracyCallback,
    CriterionCallback, ControlFlowCallback,
)

num_samples, num_features = 10_000, 10
n_classes = 10
X = torch.rand(num_samples, num_features)
y = torch.randint(0, n_classes, [num_samples])
loader = DataLoader(TensorDataset(X, y), batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

model = torch.nn.Linear(num_features, n_classes)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [3, 6])

runner = SupervisedRunner()
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=5,
    verbose=False,
    main_metric="accuracy03",
    minimize_metric=False,
    callbacks=[
        AccuracyCallback(
            accuracy_args=[1, 3, 5]
        ),
        ControlFlowCallback(
            base_callback=CriterionCallback(),
            ignore_loaders="valid"  # or loaders="train"
        )
    ]
)

In the Config API you need to use the _wrapper argument:

callbacks_params:
  ...
  loss:
    _wrapper:
       callback: ControlFlowCallback
       ignore_loaders: valid
    callback: CriterionCallback
  ...
__init__(base_callback: catalyst.core.callback.Callback, epochs: Union[int, Sequence[int]] = None, ignore_epochs: Union[int, Sequence[int]] = None, loaders: Union[str, Sequence[str], Mapping[str, Union[int, Sequence[int]]]] = None, ignore_loaders: Union[str, Sequence[str], Mapping[str, Union[int, Sequence[int]]]] = None, filter_fn: Union[str, Callable[[str, int, str], bool]] = None, use_global_epochs: bool = False)[source]
Parameters
  • base_callback (Callback) – callback to wrap

  • epochs (int/Sequence[int]) –

    epochs where the callback should be enabled; on other epochs the callback will be disabled.

    If an int/float is passed, then the callback will be enabled with the period specified by the epochs value (epoch_number % epochs == 0) and disabled on other epochs.

    If a list of epochs is passed, then the callback will be executed on the specified epochs.

    Default value is None.

  • ignore_epochs

    (int/Sequence[int]): epochs where the callback should be disabled; on other epochs the callback will be enabled.

    If an int/float is passed, then the callback will be disabled with the period specified by the epochs value (epoch_number % epochs != 0) and enabled on other epochs.

    If a list of epochs is passed, then the callback will be disabled on the specified epochs.

    Default value is None.

  • loaders (str/Sequence[str]/Mapping[str, int/Sequence[str]]) –

    loaders where the callback should be enabled; on other loaders the callback will be disabled.

    If a string is passed, then the callback will be enabled for the loader with the specified name.

    If a list/tuple of strings is passed, then the callback will be enabled for the loaders with the specified names.

    If a dictionary is passed where the keys are strings and the values are an int or a list of integers, then the callback will be enabled on the specified epochs (dictionary value) for the specified loader (dictionary key).

    Default value is None.

  • ignore_loaders (str/Sequence[str]/Mapping[str, int/Sequence[str]]) –

    loader names where the callback should be disabled; on other loaders the callback will be enabled.

    If a string is passed, then the callback will be disabled for the loader with the specified name.

    If a list/tuple of strings is passed, then the callback will be disabled for the loaders with the specified names.

    If a dictionary is passed where the keys are strings and the values are an int or a list of integers, then the callback will be disabled on the specified epochs (dictionary value) for the specified loader (dictionary key).

    Default value is None.

  • filter_fn (str or Callable[[str, int, str], bool]) –

    function to use instead of loaders or epochs arguments.

    If the object passed to filter_fn is a string, then it will be interpreted as Python code. A lambda function with three arguments - stage name (str), epoch number (int), loader name (str) - is expected, and it should return True if the callback should be enabled under the given condition.

    If a callable object is passed, then it should accept three arguments - stage name (str), epoch number (int), loader name (str) - and should return True if the callback should be enabled, otherwise False.

    Default value is None.

    Examples:

    # enable callback on all loaders
    # except the "train" loader, every 2 epochs
    ControlFlowCallback(
        ...
        filter_fn=lambda s, e, l: l != "train" and e % 2 == 0
        ...
    )
    # or with string equivalent
    ControlFlowCallback(
        ...
        filter_fn="lambda s, e, l: l != 'train' and e % 2 == 0"
        ...
    )
    

  • use_global_epochs (bool) – if True, global epoch numbers are used instead of per-stage epoch numbers; default is False
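
A hedged notebook-API sketch (illustration only; it assumes ControlFlowCallback and AccuracyCallback are importable from catalyst.dl, and that accuracy should only be computed on the validation loader):

from catalyst.dl import SupervisedRunner, AccuracyCallback, ControlFlowCallback

runner = SupervisedRunner()
runner.train(
    ...
    callbacks=[
        # compute accuracy only on the "valid" loader
        ControlFlowCallback(
            base_callback=AccuracyCallback(),
            loaders="valid",
        ),
        ...
    ]
)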

on_loader_end(runner: catalyst.core.runner.IRunner) → None[source]

Reset status of callback

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner) → None[source]

Check if current epoch should be skipped.

Parameters

runner (IRunner) – current runner

Criterion

class catalyst.core.callbacks.criterion.CriterionCallback(input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', prefix: str = 'loss', criterion_key: str = None, multiplier: float = 1.0, **metric_kwargs)[source]

Bases: catalyst.core.callbacks.metrics.IBatchMetricCallback

Callback that measures the loss with the specified criterion.

__init__(input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', prefix: str = 'loss', criterion_key: str = None, multiplier: float = 1.0, **metric_kwargs)[source]
Parameters
  • input_key (Union[str, List[str], Dict[str, str]]) – key/list/dict of keys that take values from the input dictionary. If ‘__all__’, the whole input is passed to the criterion. If None, an empty dict is passed to the criterion.

  • output_key (Union[str, List[str], Dict[str, str]]) – key/list/dict of keys that take values from the output dictionary. If ‘__all__’, the whole output is passed to the criterion. If None, an empty dict is passed to the criterion.

  • prefix (str) – prefix for metrics and output key for the loss in the runner.batch_metrics dictionary

  • criterion_key (str) – key of the criterion to use in case there are several of them stored in a dictionary.

  • multiplier (float) – scale factor for the output loss (a usage sketch follows this parameter list).
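
A hedged notebook-API sketch; the keys shown are the defaults and the criterion is just an example:

import torch

from catalyst.dl import CriterionCallback, SupervisedRunner

runner = SupervisedRunner()
runner.train(
    ...
    criterion=torch.nn.CrossEntropyLoss(),
    callbacks=[
        # writes the loss to runner.batch_metrics["loss"]
        CriterionCallback(
            input_key="targets",
            output_key="logits",
            prefix="loss",
        ),
        ...
    ]
)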

property metric_fn

Criterion function.

on_stage_start(runner: catalyst.core.runner.IRunner)[source]

Checks that the current stage has correct criterion.

Parameters

runner (IRunner) – current runner

Early Stop

class catalyst.core.callbacks.early_stop.CheckRunCallback(num_batch_steps: int = 3, num_epoch_steps: int = 2)[source]

Bases: catalyst.core.callback.Callback

Executes only a small part of the Experiment pipeline, which is useful for quick sanity checks.

__init__(num_batch_steps: int = 3, num_epoch_steps: int = 2)[source]
Parameters
  • num_batch_steps (int) – number of batches to iterate in each epoch

  • num_epoch_steps (int) – number of epochs to run in a stage (see the sketch after this list)
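
A hedged notebook-API sketch for a quick dry run:

from catalyst.core.callbacks.early_stop import CheckRunCallback
from catalyst.dl import SupervisedRunner

runner = SupervisedRunner()
runner.train(
    ...
    callbacks=[
        # dry run: stop each epoch after 3 batches
        # and the stage after 2 epochs
        CheckRunCallback(num_batch_steps=3, num_epoch_steps=2),
        ...
    ]
)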

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

Check if iterated specified number of batches.

Parameters

runner (IRunner) – current runner

on_epoch_end(runner: catalyst.core.runner.IRunner)[source]

Check if iterated specified number of epochs.

Parameters

runner (IRunner) – current runner

class catalyst.core.callbacks.early_stop.EarlyStoppingCallback(patience: int, metric: str = 'loss', minimize: bool = True, min_delta: float = 1e-06)[source]

Bases: catalyst.core.callback.Callback

Early exit based on metric.

Example of usage in notebook API:

runner = SupervisedRunner()
runner.train(
    ...
    callbacks=[
        ...
        EarlyStoppingCallback(
            patience=5,
            metric="my_metric",
            minimize=True,
        )
        ...
    ]
)
...

Example of usage in config API:

stages:
  ...
  stage_N:
    ...
    callbacks_params:
      ...
      early_stopping:
        callback: EarlyStoppingCallback
        # arguments for EarlyStoppingCallback
        patience: 5
        metric: my_metric
        minimize: true
  ...
__init__(patience: int, metric: str = 'loss', minimize: bool = True, min_delta: float = 1e-06)[source]
Parameters
  • patience (int) – number of epochs with no improvement after which training will be stopped.

  • metric (str) – metric name to use for early stopping, default is "loss".

  • minimize (bool) – if True, the metric is expected to decrease and early stopping is triggered when it stops decreasing; if False, the metric is expected to increase. Default value is True.

  • min_delta (float) – minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta counts as no improvement; default value is 1e-6.

on_epoch_end(runner: catalyst.core.runner.IRunner) → None[source]

Check if should be performed early stopping.

Parameters

runner (IRunner) – current runner

Exception

class catalyst.core.callbacks.exception.ExceptionCallback[source]

Bases: catalyst.core.callback.Callback

@TODO: Docs. Contribution is welcome.

__init__()[source]

@TODO: Docs. Contribution is welcome.

on_exception(runner: catalyst.core.runner.IRunner)[source]

@TODO: Docs. Contribution is welcome.

Logging

class catalyst.core.callbacks.logging.ILoggerCallback(order: int, node: int = <CallbackNode.All: 0>, scope: int = <CallbackScope.Stage: 0>)[source]

Bases: catalyst.core.callback.Callback

Logger callback interface, abstraction over logging step

class catalyst.core.callbacks.logging.ConsoleLogger[source]

Bases: catalyst.core.callbacks.logging.ILoggerCallback

Logger callback, translates runner.*_metrics to console and text file.

__init__()[source]

Init ConsoleLogger.

on_epoch_end(runner: catalyst.core.runner.IRunner)[source]

Translate runner.metric_manager to console and text file at the end of an epoch.

Parameters

runner (IRunner) – current runner instance

on_stage_end(runner: catalyst.core.runner.IRunner)[source]

Called at the end of each stage.

on_stage_start(runner: catalyst.core.runner.IRunner)[source]

Prepare runner.logdir for the current stage.

class catalyst.core.callbacks.logging.TensorboardLogger(metric_names: List[str] = None, log_on_batch_end: bool = True, log_on_epoch_end: bool = True)[source]

Bases: catalyst.core.callbacks.logging.ILoggerCallback

Logger callback, translates runner.metric_manager to tensorboard.

__init__(metric_names: List[str] = None, log_on_batch_end: bool = True, log_on_epoch_end: bool = True)[source]
Parameters
  • metric_names (List[str]) – list of metric names to log; if None, logs everything (see the sketch after this list)

  • log_on_batch_end (bool) – logs per-batch metrics if set True

  • log_on_epoch_end (bool) – logs per-epoch metrics if set True
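
A hedged configuration sketch; the metric names are placeholders, and whether the runner already adds a default TensorboardLogger depends on your setup:

from catalyst.core.callbacks.logging import TensorboardLogger

# log only the listed metrics, once per epoch instead of every batch
tb_logger = TensorboardLogger(
    metric_names=["loss", "accuracy01"],
    log_on_batch_end=False,
    log_on_epoch_end=True,
)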

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

Translate batch metrics to tensorboard.

on_epoch_end(runner: catalyst.core.runner.IRunner)[source]

Translate epoch metrics to tensorboard.

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

Prepare tensorboard writers for the current stage.

on_stage_end(runner: catalyst.core.runner.IRunner)[source]

Close opened tensorboard writers.

on_stage_start(runner: catalyst.core.runner.IRunner)[source]

@TODO: Docs. Contribution is welcome.

class catalyst.core.callbacks.logging.VerboseLogger(always_show: List[str] = None, never_show: List[str] = None)[source]

Bases: catalyst.core.callbacks.logging.ILoggerCallback

Logs metrics to the console using a tqdm progress bar.

__init__(always_show: List[str] = None, never_show: List[str] = None)[source]
Parameters
  • always_show (List[str]) – list of metrics to always show; if None, the default is ["_timer/_fps"]; to remove the always-shown metrics, set it to an empty list [] (a usage sketch follows this list)

  • never_show (List[str]) – list of metrics which will not be shown
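
A hedged configuration sketch; the metric names are placeholders:

from catalyst.core.callbacks.logging import VerboseLogger

# always print "loss" in the progress bar, never print the fps timer
verbose_logger = VerboseLogger(
    always_show=["loss"],
    never_show=["_timer/_fps"],
)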

on_batch_end(runner: catalyst.core.runner.IRunner)[source]

Update tqdm progress bar at the end of each batch.

on_exception(runner: catalyst.core.runner.IRunner)[source]

Called if an Exception was raised.

on_loader_end(runner: catalyst.core.runner.IRunner)[source]

Cleanup and close tqdm progress bar.

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

Init tqdm progress bar.

Metrics

class catalyst.core.callbacks.metrics.IMetricCallback(prefix: str, input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', multiplier: float = 1.0, **metrics_kwargs)[source]

Bases: abc.ABC, catalyst.core.callback.Callback

@TODO: Docs. Contribution is welcome.

__init__(prefix: str, input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', multiplier: float = 1.0, **metrics_kwargs)[source]

@TODO: Docs. Contribution is welcome.

abstract property metric_fn

Docs. Contribution is welcome.

Type

@TODO

class catalyst.core.callbacks.metrics.IBatchMetricCallback(prefix: str, input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', multiplier: float = 1.0, **metrics_kwargs)[source]

Bases: catalyst.core.callbacks.metrics.IMetricCallback

@TODO: Docs. Contribution is welcome.

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

Computes metrics and adds them to the batch metrics.

class catalyst.core.callbacks.metrics.ILoaderMetricCallback(**kwargs)[source]

Bases: catalyst.core.callbacks.metrics.IMetricCallback

@TODO: Docs. Contribution is welcome.

__init__(**kwargs)[source]

@TODO: Docs. Contribution is welcome.

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

Stores new input/output for the metric computation.

on_loader_end(runner: catalyst.core.runner.IRunner)[source]

@TODO: Docs. Contribution is welcome.

on_loader_start(runner: catalyst.core.runner.IRunner)[source]

Reinitialises internal storages.

class catalyst.core.callbacks.metrics.BatchMetricCallback(prefix: str, metric_fn: Callable, input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', multiplier: float = 1.0, **metric_kwargs)[source]

Bases: catalyst.core.callbacks.metrics.IBatchMetricCallback

A callback that computes a single metric on runner.on_batch_end.

__init__(prefix: str, metric_fn: Callable, input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', multiplier: float = 1.0, **metric_kwargs)[source]

@TODO: Docs. Contribution is welcome.

property metric_fn

Docs. Contribution is welcome.

Type

@TODO
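
The BatchMetricCallback docstrings above are still TODO; as a hedged illustration only, here is a sketch of plugging a custom metric function into it, assuming the callback calls metric_fn(outputs, targets) with the values taken from output_key and input_key:

import torch

from catalyst.core.callbacks.metrics import BatchMetricCallback


def my_accuracy(outputs: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # illustrative metric: fraction of correct argmax predictions
    return (outputs.argmax(dim=1) == targets).float().mean()


# reads runner.output["logits"] and runner.input["targets"] on each batch
# and stores the result under runner.batch_metrics["my_accuracy"]
callback = BatchMetricCallback(
    prefix="my_accuracy",
    metric_fn=my_accuracy,
    input_key="targets",
    output_key="logits",
)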

class catalyst.core.callbacks.metrics.LoaderMetricCallback(prefix: str, metric_fn: Callable, input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', multiplier: float = 1.0, **metric_kwargs)[source]

Bases: catalyst.core.callbacks.metrics.ILoaderMetricCallback

A callback that computes a single metric over the whole loader.

__init__(prefix: str, metric_fn: Callable, input_key: Union[str, List[str], Dict[str, str]] = 'targets', output_key: Union[str, List[str], Dict[str, str]] = 'logits', multiplier: float = 1.0, **metric_kwargs)[source]

@TODO: Docs. Contribution is welcome.

property metric_fn

Docs. Contribution is welcome.

Type

@TODO

catalyst.core.callbacks.metrics.MetricCallback

alias of catalyst.core.callbacks.metrics.BatchMetricCallback

class catalyst.core.callbacks.metrics.MetricAggregationCallback(prefix: str, metrics: Union[str, List[str], Dict[str, float]] = None, mode: str = 'mean', scope: str = 'batch', multiplier: float = 1.0)[source]

Bases: catalyst.core.callback.Callback

A callback to aggregate several metrics in one value.

__init__(prefix: str, metrics: Union[str, List[str], Dict[str, float]] = None, mode: str = 'mean', scope: str = 'batch', multiplier: float = 1.0) → None[source]
Parameters
  • prefix (str) – new key for aggregated metric.

  • metrics (Union[str, List[str], Dict[str, float]]) – if not None, only the values of these metric keys are aggregated. For weighted_sum aggregation it must be a Dict[str, float].

  • mode (str) – aggregation function. Must be either sum, mean or weighted_sum (see the sketch after this list).

  • scope (str) – aggregation level, "batch" or "loader"; default is "batch".

  • multiplier (float) – scale factor for the aggregated metric.
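
A hedged sketch of aggregating two per-batch losses with weighted_sum; the metric names loss_cls and loss_reg are hypothetical and assumed to be written to runner.batch_metrics by other callbacks:

from catalyst.core.callbacks.metrics import MetricAggregationCallback

# runner.batch_metrics["loss"] = 0.7 * loss_cls + 0.3 * loss_reg
aggregator = MetricAggregationCallback(
    prefix="loss",
    metrics={"loss_cls": 0.7, "loss_reg": 0.3},
    mode="weighted_sum",
    scope="batch",
)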

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

Computes the aggregated metric and adds it to the metrics.

Parameters

runner (IRunner) – current runner

on_epoch_end(runner: catalyst.core.runner.IRunner)[source]
on_loader_end(runner: catalyst.core.runner.IRunner)[source]
class catalyst.core.callbacks.metrics.MetricManagerCallback[source]

Bases: catalyst.core.callback.Callback

Prepares metrics for logging, transferring values from PyTorch to numpy.

__init__()[source]

@TODO: Docs. Contribution is welcome.

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

Batch end hook.

Parameters

runner (IRunner) – current runner

on_batch_start(runner: catalyst.core.runner.IRunner) → None[source]

Batch start hook.

Parameters

runner (IRunner) – current runner

on_epoch_start(runner: catalyst.core.runner.IRunner) → None[source]

Epoch start hook.

Parameters

runner (IRunner) – current runner

on_loader_end(runner: catalyst.core.runner.IRunner) → None[source]

Loader end hook.

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner) → None[source]

Loader start hook.

Parameters

runner (IRunner) – current runner

static to_single_value(value: Any) → float[source]

@TODO: Docs. Contribution is welcome.

Optimizer

class catalyst.core.callbacks.optimizer.IOptimizerCallback(order: int, node: int = <CallbackNode.All: 0>, scope: int = <CallbackScope.Stage: 0>)[source]

Bases: catalyst.core.callback.Callback

Optimizer callback interface, abstraction over optimizer step.

class catalyst.core.callbacks.optimizer.AMPOptimizerCallback(metric_key: str = None, optimizer_key: str = None, accumulation_steps: int = 1, grad_clip_params: Dict = None, loss_key: str = None)[source]

Bases: catalyst.core.callbacks.optimizer.IOptimizerCallback

Optimizer callback with native torch amp support.

__init__(metric_key: str = None, optimizer_key: str = None, accumulation_steps: int = 1, grad_clip_params: Dict = None, loss_key: str = None)[source]
Parameters
  • loss_key (str) – key to get the loss from runner.batch_metrics

  • optimizer_key (str) – key to select the optimizer in case there are several of them stored in a dictionary.

  • accumulation_steps (int) – number of steps before model.zero_grad()

  • grad_clip_params (dict) – params for gradient clipping

grad_step(*, optimizer: torch.optim.optimizer.Optimizer, grad_clip_fn: Callable = None) → None[source]

Makes a gradient step for a given optimizer.

Parameters
  • optimizer (Optimizer) – the optimizer

  • grad_clip_fn (Callable) – function for gradient clipping

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

On batch end event

Parameters

runner (IRunner) – current runner

on_batch_start(runner: catalyst.core.runner.IRunner) → None[source]

On batch start event

Parameters

runner (IRunner) – current runner

on_epoch_end(runner: catalyst.core.runner.IRunner) → None[source]

On epoch end event.

Parameters

runner (IRunner) – current runner

on_stage_end(runner: catalyst.core.runner.IRunner) → None[source]

On stage end event.

Parameters

runner (IRunner) – current runner

on_stage_start(runner: catalyst.core.runner.IRunner) → None[source]

Checks that the current stage has correct optimizer.

Parameters

runner (IRunner) – current runner

class catalyst.core.callbacks.optimizer.OptimizerCallback(metric_key: str = None, optimizer_key: str = None, accumulation_steps: int = 1, grad_clip_params: Dict = None, decouple_weight_decay: bool = True, loss_key: str = None, use_fast_zero_grad: bool = False, xla_barrier: bool = True)[source]

Bases: catalyst.core.callbacks.optimizer.IOptimizerCallback

Optimizer callback, abstraction over optimizer step.

__init__(metric_key: str = None, optimizer_key: str = None, accumulation_steps: int = 1, grad_clip_params: Dict = None, decouple_weight_decay: bool = True, loss_key: str = None, use_fast_zero_grad: bool = False, xla_barrier: bool = True)[source]
Parameters
  • loss_key (str) – key to get the loss from runner.batch_metrics

  • optimizer_key (str) – key to select the optimizer in case there are several of them stored in a dictionary.

  • accumulation_steps (int) – number of steps before model.zero_grad() (gradient accumulation; see the sketch after this list)

  • grad_clip_params (dict) – params for gradient clipping

  • decouple_weight_decay (bool) – if True, decouples weight decay regularization.

  • use_fast_zero_grad (bool) – speed up optimizer.zero_grad(); default is False.

  • xla_barrier (bool) –

    barrier option for XLA; see the PyTorch XLA documentation for usage of the barrier flag and examples.

    Default is True.
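
A hedged gradient-accumulation sketch; it assumes metric_key="loss" matches the key your criterion callback writes to runner.batch_metrics:

from catalyst.core.callbacks.optimizer import OptimizerCallback
from catalyst.dl import SupervisedRunner

runner = SupervisedRunner()
runner.train(
    ...
    callbacks=[
        # accumulate gradients over 4 batches before each optimizer step,
        # emulating a 4x larger batch size
        OptimizerCallback(
            metric_key="loss",
            accumulation_steps=4,
        ),
        ...
    ]
)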

grad_step(*, optimizer: torch.optim.optimizer.Optimizer, optimizer_wds: List[float] = 0, grad_clip_fn: Callable = None) → None[source]

Makes a gradient step for a given optimizer.

Parameters
  • optimizer (Optimizer) – the optimizer

  • optimizer_wds (List[float]) – list of weight decay parameters for each param group

  • grad_clip_fn (Callable) – function for gradient clipping

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

On batch end event

Parameters

runner (IRunner) – current runner

on_epoch_end(runner: catalyst.core.runner.IRunner) → None[source]

On epoch end event.

Parameters

runner (IRunner) – current runner

on_epoch_start(runner: catalyst.core.runner.IRunner) → None[source]

On epoch start event.

Parameters

runner (IRunner) – current runner

on_stage_start(runner: catalyst.core.runner.IRunner) → None[source]

Checks that the current stage has correct optimizer.

Parameters

runner (IRunner) – current runner

PeriodicLoaderCallback

class catalyst.core.callbacks.periodic_loader.PeriodicLoaderCallback(**kwargs)[source]

Bases: catalyst.core.callback.Callback

Callback for running loaders with a specified period. To disable a loader, use 0 as its period (specifying 0 for the validation loader raises an error).

For example, if you have train, train_additional, valid and valid_additional loaders and want to use train_additional every 2 epochs, valid every 3 epochs and valid_additional every 5 epochs:

from catalyst.dl import (
    SupervisedRunner, PeriodicLoaderCallback,
)
runner = SupervisedRunner()
runner.train(
    ...
    loaders={
        "train": ...,
        "train_additional": ...,
        "valid": ...,
        "valid_additional": ...,
    },
    ...
    callbacks=[
        ...
        PeriodicLoaderCallback(
            train_additional=2,
            valid=3,
            valid_additional=5,
        ),
        ...
    ],
    ...
)
__init__(**kwargs)[source]
Parameters

kwargs – loader names and their run periods.

on_epoch_end(runner: catalyst.core.runner.IRunner) → None[source]

Check if validation metric should be dropped for current epoch.

Parameters

runner (IRunner) – current runner

on_epoch_start(runner: catalyst.core.runner.IRunner) → None[source]

Sets the loaders for the current epoch. If validation is not required, the first loader used in the current epoch is treated as the validation loader. For epochs where the true validation loader is skipped, the metrics from the latest epoch that did run it are reused.

Parameters

runner (IRunner) – current runner

Raises

ValueError – if there are no loaders in epoch

on_stage_start(runner: catalyst.core.runner.IRunner) → None[source]

Collect information about loaders.

Parameters

runner (IRunner) – current runner

Raises

ValueError – if there are no loaders in epoch

Scheduler

class catalyst.core.callbacks.scheduler.ISchedulerCallback(order: int, node: int = <CallbackNode.All: 0>, scope: int = <CallbackScope.Stage: 0>)[source]

Bases: catalyst.core.callback.Callback

Scheduler callback interface, abstraction over scheduler step.

class catalyst.core.callbacks.scheduler.SchedulerCallback(scheduler_key: str = None, mode: str = None, reduced_metric: str = None)[source]

Bases: catalyst.core.callbacks.scheduler.ISchedulerCallback

Callback for wrapping schedulers.

Notebook API example:

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst.dl import (
    SupervisedRunner, AccuracyCallback,
    CriterionCallback, SchedulerCallback,
)

num_samples, num_features = 10_000, 10
n_classes = 10
X = torch.rand(num_samples, num_features)
y = torch.randint(0, n_classes, [num_samples])
loader = DataLoader(TensorDataset(X, y), batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

model = torch.nn.Linear(num_features, n_classes)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [3, 6])

runner = SupervisedRunner()
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=5,
    verbose=False,
    main_metric="accuracy03",
    minimize_metric=False,
    callbacks=[
        AccuracyCallback(
            accuracy_args=[1, 3, 5]
        ),
        SchedulerCallback(reduced_metric="loss")
    ]
)

Config API usage example:

stages:
  ...
  scheduler_params:
    scheduler: MultiStepLR
    milestones: [1]
    gamma: 0.3
  ...
  stage_N:
    ...
    callbacks_params:
      ...
      scheduler:
        callback: SchedulerCallback
        # arguments for SchedulerCallback
        reduced_metric: loss
  ...
__init__(scheduler_key: str = None, mode: str = None, reduced_metric: str = None)[source]
Parameters
  • scheduler_key (str) – scheduler name to use in case there are several of them stored in a dictionary; default is None.

  • mode (str) – scheduler mode, should be one of "epoch" or "batch", default is None. If None and the scheduler is an instance of BatchScheduler or OneCycleLRWithWarmup, "batch" is used, otherwise "epoch".

  • reduced_metric (str) – metric name to forward to the scheduler object; if None, the main metric specified in the experiment is used.

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

Batch end hook.

Parameters

runner (IRunner) – current runner

on_epoch_end(runner: catalyst.core.runner.IRunner) → None[source]

Epoch end hook.

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner) → None[source]

Loader start hook.

Parameters

runner (IRunner) – current runner

on_stage_start(runner: catalyst.core.runner.IRunner) → None[source]

Stage start hook.

Parameters

runner (IRunner) – current runner

step_batch(runner: catalyst.core.runner.IRunner) → None[source]

Update learning rate and momentum in runner.

Parameters

runner (IRunner) – current runner

step_epoch(runner: catalyst.core.runner.IRunner) → None[source]

Update momentum in runner.

Parameters

runner (IRunner) – current runner

class catalyst.core.callbacks.scheduler.LRUpdater(optimizer_key: str = None)[source]

Bases: abc.ABC, catalyst.core.callback.Callback

Base class that all LR updaters inherit from.

__init__(optimizer_key: str = None)[source]
Parameters

optimizer_key (str) – which optimizer key to use for learning rate scheduling

abstract calc_lr()[source]

Interface for calculating learning rate.

abstract calc_momentum()[source]

Interface for calculating momentum
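
A hedged subclass sketch; it assumes calc_lr / calc_momentum should return the new learning rate and momentum, which update_optimizer then applies on each batch, and the linear schedule itself is purely illustrative:

from catalyst.core.callbacks.scheduler import LRUpdater


class LinearLRDecay(LRUpdater):
    """Hypothetical updater: linearly decays the LR over num_steps batches."""

    def __init__(self, start_lr=1e-3, end_lr=1e-5, num_steps=1000, optimizer_key=None):
        super().__init__(optimizer_key=optimizer_key)
        self.start_lr = start_lr
        self.end_lr = end_lr
        self.num_steps = num_steps
        self._step = 0

    def calc_lr(self):
        # fraction of the schedule completed, clipped to [0, 1]
        progress = min(self._step / self.num_steps, 1.0)
        self._step += 1
        return self.start_lr + (self.end_lr - self.start_lr) * progress

    def calc_momentum(self):
        # keep momentum constant in this sketch
        return 0.9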

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

Batch end hook.

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner) → None[source]

Loader start hook.

Parameters

runner (IRunner) – current runner

on_stage_start(runner: catalyst.core.runner.IRunner) → None[source]

Stage start hook.

Parameters

runner (IRunner) – current runner

update_optimizer(runner: catalyst.core.runner.IRunner) → None[source]

Update learning rate and momentum in runner.

Parameters

runner (IRunner) – current runner

Timer

class catalyst.core.callbacks.timer.TimerCallback[source]

Bases: catalyst.core.callback.Callback

Logs pipeline execution time.

__init__()[source]

@TODO: Docs. Contribution is welcome.

on_batch_end(runner: catalyst.core.runner.IRunner) → None[source]

Batch end hook.

Parameters

runner (IRunner) – current runner

on_batch_start(runner: catalyst.core.runner.IRunner) → None[source]

Batch start hook.

Parameters

runner (IRunner) – current runner

on_loader_end(runner: catalyst.core.runner.IRunner) → None[source]

Loader end hook.

Parameters

runner (IRunner) – current runner

on_loader_start(runner: catalyst.core.runner.IRunner) → None[source]

Loader start hook.

Parameters

runner (IRunner) – current runner

Validation

class catalyst.core.callbacks.validation.ValidationManagerCallback[source]

Bases: catalyst.core.callback.Callback

A callback to aggregate runner.valid_metrics from runner.epoch_metrics.

__init__()[source]

@TODO: Docs. Contribution is welcome.

on_epoch_end(runner: catalyst.core.runner.IRunner) → None[source]

Epoch end hook.

Parameters

runner (IRunner) – current runner

on_epoch_start(runner: catalyst.core.runner.IRunner) → None[source]

Epoch start hook.

Parameters

runner (IRunner) – current runner

Utils

catalyst.core.utils.callbacks.sort_callbacks_by_order(callbacks: Union[List, Dict, collections.OrderedDict]) → collections.OrderedDict[source]

Creates a sequence of callbacks and sorts them.

Parameters

callbacks – either list of callbacks or ordered dict

Returns

sequence of callbacks sorted by callback order

Raises

TypeError – if callbacks is not one of None, dict, OrderedDict, list
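
A hedged usage sketch (the callback instances are placeholders):

from catalyst.core.utils.callbacks import sort_callbacks_by_order
from catalyst.dl import CriterionCallback, SchedulerCallback

callbacks = sort_callbacks_by_order([CriterionCallback(), SchedulerCallback()])
# OrderedDict of the same callbacks, sorted by their `order` attribute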

catalyst.core.utils.callbacks.filter_callbacks_by_node(callbacks: Union[Dict, collections.OrderedDict]) → Union[Dict, collections.OrderedDict][source]

Filters callbacks based on running node. Deletes worker-only callbacks from CallbackNode.Master and master-only callbacks from CallbackNode.Worker.

Parameters

callbacks (Union[Dict, OrderedDict]) – callbacks

Returns

filtered callbacks dictionary.

Return type

Union[Dict, OrderedDict]

Legacy

Runner

class catalyst.core.legacy.IRunnerLegacy[source]

Bases: object

Special class to encapsulate all catalyst.core.runner.IRunner and catalyst.core.runner.State legacy into one place. Used to make catalyst.core.runner.IRunner cleaner and easier to understand.

Saved for backward compatibility. Should be removed someday.

property batch_in

Alias for runner.input.

Warning

Deprecated, saved for backward compatibility. Please use runner.input instead.

property batch_out

Alias for runner.output.

Warning

Deprecated, saved for backward compatibility. Please use runner.output instead.

property loader_step

Alias for runner.loader_batch_step.

Warning

Deprecated, saved for backward compatibility. Please use runner.loader_batch_step instead.

property need_backward_pass

Alias for runner.is_train_loader.

Warning

Deprecated, saved for backward compatibility. Please use runner.is_train_loader instead.

property state

Alias for runner.

Warning

Deprecated, saved for backward compatibility. Please use runner instead.