Runners

ISupervisedRunner

class catalyst.runners.supervised.ISupervisedRunner(input_key: Any = 'features', output_key: Any = 'logits', target_key: str = 'targets', loss_key: str = 'loss')[source]

Bases: catalyst.runners.runner.Runner

IRunner for experiments with supervised model.

Parameters
  • input_key – key in runner.batch dict mapping for model input

  • output_key – key for runner.batch to store model output

  • target_key – key in runner.batch dict mapping for target

  • loss_key – key for runner.batch_metrics to store criterion loss output

This class is an abstraction; see the implementations for more details:

  • catalyst.runners.runner.SupervisedRunner

Note

ISupervisedRunner contains only the logic with batch handling.

ISupervisedRunner logic pseudocode:

batch = {"input_key": tensor, "target_key": tensor}
output = model(batch["input_key"])
batch["output_key"] = output
loss = criterion(batch["output_key"], batch["target_key"])
batch_metrics["loss_key"] = loss
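The pseudocode above can be made concrete in plain Python. The sketch below is not Catalyst code: `handle_batch_sketch`, `toy_model` and `toy_criterion` are hypothetical stand-ins. It only illustrates how the four configured keys route data through `batch` and `batch_metrics`.

```python
from typing import Any, Callable, Dict


def handle_batch_sketch(
    batch: Dict[str, Any],
    model: Callable[[Any], Any],
    criterion: Callable[[Any, Any], float],
    input_key: str = "features",
    output_key: str = "logits",
    target_key: str = "targets",
    loss_key: str = "loss",
) -> Dict[str, float]:
    """Hypothetical stand-in for the key-based batch handling described above."""
    batch_metrics: Dict[str, float] = {}
    # read the model input under input_key, store the output under output_key
    batch[output_key] = model(batch[input_key])
    # compare the stored output with the target, store the loss under loss_key
    batch_metrics[loss_key] = criterion(batch[output_key], batch[target_key])
    return batch_metrics


def toy_model(xs):
    # assumption for illustration: a "model" that doubles every input
    return [2.0 * x for x in xs]


def toy_criterion(outputs, targets):
    # mean squared error over the batch
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


batch = {"features": [1.0, 2.0], "targets": [2.0, 5.0]}
batch_metrics = handle_batch_sketch(batch, toy_model, toy_criterion)
```

Note that the keys are the configured values (`"features"`, `"logits"`, `"targets"`, `"loss"` by default), not the literal strings `"input_key"` and so on used in the pseudocode.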

Note

Please follow the minimal examples sections for use cases.

forward(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Forward method for your Runner. It should not be called directly outside of the runner. If your model has a specific interface, override this method to use it.

Parameters
  • batch (Mapping[str, Any]) – dictionary with data batches from DataLoaders.

  • **kwargs – additional parameters to pass to the model

Returns

dict with model output batch
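As an illustration of this override point, here is a sketch that uses a hypothetical stand-in base class rather than the real Catalyst runner. `SketchRunner`, `TwoInputModel` and the batch keys `"image"` and `"mask"` are assumptions. The only part taken from the description above is that `forward` receives the batch mapping and returns a dict with the model output.

```python
from typing import Any, Dict, Mapping


class TwoInputModel:
    """Toy model whose call signature takes two inputs instead of one."""

    def __call__(self, image, mask):
        return [i * m for i, m in zip(image, mask)]


class SketchRunner:
    """Hypothetical stand-in that mimics the documented forward contract."""

    def __init__(self, model, input_key="features", output_key="logits"):
        self.model = model
        self.input_key = input_key
        self.output_key = output_key

    def forward(self, batch: Mapping[str, Any], **kwargs) -> Dict[str, Any]:
        # default behavior: a single model input stored under input_key
        return {self.output_key: self.model(batch[self.input_key], **kwargs)}


class TwoInputRunner(SketchRunner):
    def forward(self, batch: Mapping[str, Any], **kwargs) -> Dict[str, Any]:
        # the model needs two inputs, so forward is overridden to unpack both
        output = self.model(batch["image"], batch["mask"], **kwargs)
        return {self.output_key: output}


runner = TwoInputRunner(TwoInputModel())
result = runner.forward({"image": [1, 2, 3], "mask": [1, 0, 1]})
```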

handle_batch(batch: Mapping[str, Any]) → None[source]

Inner method to handle specified data batch. Used to make a train/valid/infer step during Experiment run.

Parameters

batch – dictionary with data batches from DataLoader.

Runner

class catalyst.runners.runner.Runner(*args, **kwargs)[source]

Bases: catalyst.core.runner.IRunner

Single-stage deep learning Runner with user-friendly API.

Runner supports configuring a deep learning pipeline in pure Python code. Please check the examples for intuition.

Parameters
  • *args – IRunner args (model, engine)

  • **kwargs – IRunner kwargs (model, engine)

Note

IRunner supports only base user-friendly callbacks, like TqdmCallback, TimerCallback, CheckRunCallback, BatchOverfitCallback, and CheckpointCallback.

It does not automatically add Criterion, Optimizer or Scheduler callbacks.

This means that you have to perform the optimization step yourself in the handle_batch method, or specify the required callbacks in the .train or get_callbacks methods.

For a more ready-to-use supervised case, see catalyst.runners.runner.SupervisedRunner.

Note

Please follow the minimal examples sections for use cases.

Examples:

import os
from torch import nn, optim
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.contrib.datasets import MNIST

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = optim.Adam(model.parameters(), lr=0.02)
loaders = {
    "train": DataLoader(MNIST(os.getcwd(), train=True), batch_size=32),
    "valid": DataLoader(MNIST(os.getcwd(), train=False), batch_size=32),
}

class CustomRunner(dl.Runner):
    def predict_batch(self, batch):
        # model inference step
        return self.model(batch[0].to(self.device))

    def on_loader_start(self, runner):
        super().on_loader_start(runner)
        self.meters = {
            key: metrics.AdditiveMetric(compute_on_call=False)
            for key in ["loss", "accuracy01", "accuracy03"]
        }

    def handle_batch(self, batch):
        # model train/valid step
        # unpack the batch
        x, y = batch
        # run model forward pass
        logits = self.model(x)
        # compute the loss
        loss = F.cross_entropy(logits, y)
        # compute other metrics of interest
        accuracy01, accuracy03 = metrics.accuracy(logits, y, topk=(1, 3))
        # log metrics
        self.batch_metrics.update(
            {"loss": loss, "accuracy01": accuracy01, "accuracy03": accuracy03}
        )
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.meters[key].update(
                self.batch_metrics[key].item(), self.batch_size
            )
        # run model backward pass
        if self.is_train_loader:
            self.engine.backward(loss)
            self.optimizer.step()
            self.optimizer.zero_grad()

    def on_loader_end(self, runner):
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.loader_metrics[key] = self.meters[key].compute()[0]
        super().on_loader_end(runner)

runner = CustomRunner()
# model training
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    logdir="./logs",
    num_epochs=5,
    verbose=True,
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
)
# model inference
for logits in runner.predict_loader(loader=loaders["valid"]):
    assert logits.detach().cpu().numpy().shape[-1] == 10

evaluate_loader(loader: torch.utils.data.dataloader.DataLoader, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, model: Optional[torch.nn.modules.module.Module] = None, engine: Union[Engine, str] = None, seed: int = 42, verbose: bool = False) → Dict[str, Any][source]

Evaluates dataloader with given model and returns obtained metrics.

Parameters
  • loader – loader to evaluate

  • callbacks – list or dictionary with catalyst callbacks

  • model – model compatible with the current runner. If None, the runner's current model is used.

  • engine – engine to use for model evaluation

  • seed – random seed to use before prediction

  • verbose – if True, it displays the status of the evaluation to the console.

Returns

Dict with metrics counted on the loader.

Raises

IRunnerError – if CheckpointCallback found in the callbacks

get_callbacks() → OrderedDict[str, Callback][source]

Returns the callbacks for the experiment.

get_criterion() → Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module], None][source]

Returns the criterion for the experiment.

get_engine() → catalyst.core.engine.Engine[source]

Returns the engine for the experiment.

get_loaders() → OrderedDict[str, DataLoader][source]

Returns the loaders for the experiment.

get_loggers() → Dict[str, catalyst.core.logger.ILogger][source]

Returns the loggers for the experiment.

get_model() → Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]][source]

Returns the model for the experiment.

get_optimizer(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]]) → Union[torch.optim.optimizer.Optimizer, Dict[str, torch.optim.optimizer.Optimizer], None][source]

Returns the optimizer for the experiment.

get_scheduler(optimizer: Union[torch.optim.optimizer.Optimizer, Dict[str, torch.optim.optimizer.Optimizer]]) → Union[torch.optim.lr_scheduler._LRScheduler, torch.optim.lr_scheduler.ReduceLROnPlateau, Dict[str, Union[torch.optim.lr_scheduler._LRScheduler, torch.optim.lr_scheduler.ReduceLROnPlateau]], None][source]

Returns the scheduler for the experiment.

property hparams

Returns hyperparameters.

property num_epochs

Returns the number of epochs in the experiment.

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Run model inference on specified data batch.

Parameters
  • batch – dictionary with data batches from DataLoader.

  • **kwargs – additional kwargs to pass to the model

Returns

Mapping: model output dictionary

Raises

NotImplementedError – if not implemented yet

predict_loader(*, loader: torch.utils.data.dataloader.DataLoader, model: torch.nn.modules.module.Module = None, engine: Union[Engine, str] = None, seed: int = 42, resume: str = None, cpu: bool = False, fp16: bool = False) → Generator[source]

Runs model inference on PyTorch DataLoader and returns python generator with model predictions from runner.predict_batch.

Parameters
  • loader – loader to predict

  • model – model to use for prediction

  • engine – engine to use for prediction

  • seed – random seed to use before prediction

  • resume – path to checkpoint for model

  • cpu – boolean flag to force CPU usage

  • fp16 – boolean flag to use half-precision

Yields

batches with model predictions

Note

Please follow the minimal examples sections for use cases.
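The generator behavior of predict_loader can be pictured with a plain-Python sketch. This is not the Catalyst implementation: `predict_loader_sketch` and `toy_predict_batch` are hypothetical, and the real method also handles the model, engine, seed and checkpoint arguments listed above.

```python
from typing import Any, Callable, Iterable, Iterator


def predict_loader_sketch(
    loader: Iterable[Any], predict_batch: Callable[[Any], Any]
) -> Iterator[Any]:
    """Hypothetical sketch: lazily yield one prediction per batch."""
    for batch in loader:
        # each batch is processed only when the caller asks for the next item
        yield predict_batch(batch)


def toy_predict_batch(batch):
    # assumption for illustration: "inference" adds one to every element
    return [x + 1 for x in batch]


loader = [[1, 2], [3, 4], [5]]
predictions = predict_loader_sketch(loader, toy_predict_batch)
first = next(predictions)  # only the first batch has been processed so far
rest = list(predictions)   # consume the remaining batches
```

Because the result is a generator, predictions are produced one batch at a time, so the caller decides whether to stream them or collect them into a list.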

property seed

Experiment’s initial seed value.

train(*, loaders: OrderedDict[str, DataLoader], model: torch.nn.modules.module.Module = None, engine: Union[Engine, str] = None, criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: Union[torch.optim.lr_scheduler._LRScheduler, torch.optim.lr_scheduler.ReduceLROnPlateau] = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, loggers: Dict[str, ILogger] = None, seed: int = 42, hparams: Dict[str, Any] = None, num_epochs: int = 1, logdir: str = None, resume: str = None, valid_loader: str = None, valid_metric: str = None, minimize_valid_metric: bool = None, verbose: bool = False, timeit: bool = False, check: bool = False, overfit: bool = False, profile: bool = False, load_best_on_end: bool = False, cpu: bool = False, fp16: bool = False, ddp: bool = False) → None[source]

Starts the training of the model.

Parameters
  • loaders – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference

  • model – model to train

  • engine – engine to use for model training

  • criterion – criterion function for training

  • optimizer – optimizer for training

  • scheduler – scheduler for training

  • callbacks – list or dictionary with Catalyst callbacks

  • loggers – dictionary with Catalyst loggers

  • seed – experiment’s initial seed value

  • hparams – hyperparameters for the run

  • num_epochs – number of training epochs

  • logdir – path to output directory

  • resume – path to checkpoint for model

  • valid_loader – loader name used to calculate the metrics and save the checkpoints. For example, you can pass train and then the metrics will be taken from train loader.

  • valid_metric – name of the metric used to select the checkpoints.

  • minimize_valid_metric – flag to indicate whether the valid_metric should be minimized or not (default: True).

  • verbose – if True, it displays the status of the training to the console.

  • timeit – if True, computes the execution time of training process and displays it to the console.

  • check – if True, then only checks that pipeline is working (3 epochs only with 3 batches per loader)

  • overfit – if True, then takes only one batch per loader for model overfitting; for advanced usage please check BatchOverfitCallback

  • profile – if True, then uses ProfilerCallback; for advanced usage please check ProfilerCallback

  • load_best_on_end – if True, Runner will load best checkpoint state (model, optimizer, etc) according to validation metrics. Requires specified logdir.

  • cpu – boolean flag to force CPU usage

  • fp16 – boolean flag to use half-precision

  • ddp – if True, starts training in distributed mode. Note: works only with Python scripts; Jupyter is not supported.

Note

Please follow the minimal examples sections for use cases.
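How valid_loader, valid_metric and minimize_valid_metric interact can be sketched in plain Python. The helper `select_best_epoch` and the metric values are hypothetical. The sketch only illustrates picking the epoch whose metric on the chosen loader is lowest or highest, and it is not Catalyst's checkpointing code.

```python
from typing import Dict, List

# per-epoch metrics, keyed by loader name and then by metric name (made-up values)
EpochMetrics = Dict[str, Dict[str, float]]


def select_best_epoch(
    epoch_metrics: List[EpochMetrics],
    valid_loader: str,
    valid_metric: str,
    minimize_valid_metric: bool = True,
) -> int:
    """Hypothetical helper: return the 1-based index of the best epoch."""
    values = [metrics[valid_loader][valid_metric] for metrics in epoch_metrics]
    # a loss is usually minimized, an accuracy is usually maximized
    pick = min if minimize_valid_metric else max
    return values.index(pick(values)) + 1


history = [
    {"train": {"loss": 0.9, "accuracy01": 0.70}, "valid": {"loss": 0.8, "accuracy01": 0.72}},
    {"train": {"loss": 0.5, "accuracy01": 0.85}, "valid": {"loss": 0.6, "accuracy01": 0.80}},
    {"train": {"loss": 0.3, "accuracy01": 0.92}, "valid": {"loss": 0.7, "accuracy01": 0.78}},
]

best_by_valid_loss = select_best_epoch(history, "valid", "loss", True)
best_by_valid_acc = select_best_epoch(history, "valid", "accuracy01", False)
# passing "train" as valid_loader takes the metric from the train loader instead
best_by_train_loss = select_best_epoch(history, "train", "loss", True)
```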

SupervisedRunner

class catalyst.runners.supervised.SupervisedRunner(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] = None, engine: catalyst.core.engine.Engine = None, input_key: Any = 'features', output_key: Any = 'logits', target_key: str = 'targets', loss_key: str = 'loss')[source]

Bases: catalyst.runners.supervised.ISupervisedRunner, catalyst.runners.runner.Runner

Runner for experiments with supervised model.

Parameters
  • model – Torch model instance

  • engine – Engine instance

  • input_key – key in runner.batch dict mapping for model input

  • output_key – key for runner.batch to store model output

  • target_key – key in runner.batch dict mapping for target

  • loss_key – key for runner.batch_metrics to store criterion loss output

SupervisedRunner logic pseudocode:

batch = {"input_key": tensor, "target_key": tensor}
output = model(batch["input_key"])
batch["output_key"] = output
loss = criterion(batch["output_key"], batch["target_key"])
batch_metrics["loss_key"] = loss

Note

Please follow the minimal examples sections for use cases.

get_callbacks() → OrderedDict[str, Callback][source]

Returns the callbacks for the experiment.

predict_batch(batch: Mapping[str, Any], **kwargs) → Mapping[str, Any][source]

Run model inference on specified data batch.

Warning

You should not override this method. If you need a specific model call, override the runner.forward() method instead.

Parameters
  • batch – dictionary with data batch from DataLoader.

  • **kwargs – additional kwargs to pass to the model

Returns

model output dictionary

Return type

Mapping[str, Any]