Runners¶
Runner Extensions¶
ISupervisedRunner¶
- class catalyst.runners.supervised.ISupervisedRunner(input_key: Any = 'features', output_key: Any = 'logits', target_key: str = 'targets', loss_key: str = 'loss')[source]¶
Bases:
catalyst.core.runner.IRunner
IRunner for experiments with a supervised model.
- Parameters
input_key – key in runner.batch dict mapping for model input
output_key – key for runner.batch to store model output
target_key – key in runner.batch dict mapping for target
loss_key – key for runner.batch_metrics to store criterion loss output
Abstraction, please check out implementations for more details:
catalyst.runners.runner.SupervisedRunner
catalyst.runners.config.SupervisedConfigRunner
catalyst.runners.hydra.SupervisedHydraRunner
Note
ISupervisedRunner contains only the batch-handling logic.
ISupervisedRunner logic pseudocode:
batch = {"input_key": tensor, "target_key": tensor} output = model(batch["input_key"]) batch["output_key"] = output loss = criterion(batch["output_key"], batch["target_key"]) batch_metrics["loss_key"] = loss
Note
Please follow the minimal examples sections for use cases.
Examples:
import os
from torch import nn, optim
from torch.utils.data import DataLoader
from catalyst import dl, utils
from catalyst.data import ToTensor
from catalyst.contrib.datasets import MNIST

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(
        MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()),
        batch_size=32,
    ),
    "valid": DataLoader(
        MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()),
        batch_size=32,
    ),
}

runner = dl.SupervisedRunner(
    input_key="features", output_key="logits", target_key="targets", loss_key="loss"
)

# model training
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    loaders=loaders,
    num_epochs=1,
    callbacks=[
        dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 3)),
        dl.PrecisionRecallF1SupportCallback(
            input_key="logits", target_key="targets", num_classes=10
        ),
        dl.AUCCallback(input_key="logits", target_key="targets"),
    ],
    logdir="./logs",
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
    verbose=True,
    load_best_on_end=True,
)

# model inference
for prediction in runner.predict_loader(loader=loaders["valid"]):
    assert prediction["logits"].detach().cpu().numpy().shape[-1] == 10
- __init__(input_key: Any = 'features', output_key: Any = 'logits', target_key: str = 'targets', loss_key: str = 'loss')[source]¶
Init.
- forward(batch: Mapping[str, Any], **kwargs) Mapping[str, Any] [source]¶
Forward method for your Runner. Should not be called directly outside of the runner. If your model has a specific interface, override this method to use it.
- Parameters
batch (Mapping[str, Any]) – dictionary with data batches from DataLoaders.
**kwargs – additional parameters to pass to the model
- Returns
dict with model output batch
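If your model has a custom call signature, overriding forward is the intended extension point. The sketch below is illustrative only: the two-input model interface and the "lengths" key are hypothetical, not part of the Catalyst API, and the returned dict uses the default output_key ("logits"):

from typing import Any, Mapping
from catalyst import dl

class TwoInputSupervisedRunner(dl.SupervisedRunner):
    def forward(self, batch: Mapping[str, Any], **kwargs) -> Mapping[str, Any]:
        # call the model with its own interface instead of model(batch[input_key])
        logits = self.model(batch["features"], batch["lengths"], **kwargs)
        # return a dict so the runner can store it in runner.batch under output_key
        return {"logits": logits}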
ISelfSupervisedRunner¶
- class catalyst.runners.self_supervised.ISelfSupervisedRunner(input_key: str = 'features', target_key: str = 'target', loss_key: str = 'loss', augemention_prefix: str = 'augment', projection_prefix: str = 'projection', embedding_prefix: str = 'embedding')[source]¶
Bases:
catalyst.core.runner.IRunner
IRunner for experiments with a contrastive model.
- Parameters
input_key – key in runner.batch dict mapping for model input
target_key – key in runner.batch dict mapping for target
loss_key – key for runner.batch_metrics to store criterion loss output
augemention_prefix – key for runner.batch to sample augmentations
projection_prefix – key for runner.batch to store model projection
embedding_prefix – key for runner.batch to store model embeddings
Abstraction, please check out implementations for more details:
catalyst.runners.contrastive.ContrastiveRunner
Note
ISelfSupervisedRunner contains only the batch-handling logic.
ISelfSupervisedRunner logic pseudocode:
batch = {"aug1": tensor, "aug2": tensor, ...} _, proj1 = model(batch["aug1"]) _, proj2 = model(batch["aug2"]) loss = criterion(proj1, proj2) batch_metrics["loss_key"] = loss
Examples:
# 1. loader and transforms
transforms = Compose(
    [
        ToTensor(),
        Normalize((0.1307,), (0.3081,)),
        torchvision.transforms.RandomCrop((28, 28)),
        torchvision.transforms.RandomVerticalFlip(),
        torchvision.transforms.RandomHorizontalFlip(),
    ]
)

mnist = MNIST("./logdir", train=True, download=True, transform=None)
contrastive_mnist = ContrastiveDataset(mnist, transforms=transforms)
train_loader = torch.utils.data.DataLoader(contrastive_mnist, batch_size=BATCH_SIZE)

# 2. model and optimizer
encoder = MnistSimpleNet(out_features=16)
projection_head = nn.Sequential(
    nn.Linear(16, 16, bias=False),
    nn.ReLU(inplace=True),
    nn.Linear(16, 16, bias=True),
)

class ContrastiveModel(torch.nn.Module):
    def __init__(self, model, encoder):
        super(ContrastiveModel, self).__init__()
        self.model = model
        self.encoder = encoder

    def forward(self, x):
        emb = self.encoder(x)
        projection = self.model(emb)
        return emb, projection

model = ContrastiveModel(model=projection_head, encoder=encoder)
optimizer = Adam(model.parameters(), lr=LR)

# 3. criterion with triplets sampling
criterion = NTXentLoss(tau=0.1)

callbacks = [
    dl.ControlFlowCallback(
        dl.CriterionCallback(
            input_key="projection_left", target_key="projection_right", metric_key="loss"
        ),
        loaders="train",
    ),
    dl.SklearnModelCallback(
        feature_key="embedding_left",
        target_key="target",
        train_loader="train",
        valid_loaders="valid",
        model_fn=RandomForestClassifier,
        predict_method="predict_proba",
        predict_key="sklearn_predict",
        random_state=RANDOM_STATE,
        n_estimators=10,
    ),
    dl.ControlFlowCallback(
        dl.AccuracyCallback(
            target_key="target", input_key="sklearn_predict", topk_args=(1, 3)
        ),
        loaders="valid",
    ),
]

runner = dl.ContrastiveRunner()
logdir = "./logdir"

runner.train(
    model=model,
    engine=engine or dl.DeviceEngine(device),
    criterion=criterion,
    optimizer=optimizer,
    callbacks=callbacks,
    loaders={"train": train_loader, "valid": train_loader},
    verbose=True,
    logdir=logdir,
    valid_loader="train",
    valid_metric="loss",
    minimize_valid_metric=True,
    num_epochs=10,
)
Note
Please follow the minimal examples sections for use cases.
- __init__(input_key: str = 'features', target_key: str = 'target', loss_key: str = 'loss', augemention_prefix: str = 'augment', projection_prefix: str = 'projection', embedding_prefix: str = 'embedding')[source]¶
Init.
- forward(batch: Mapping[str, Any], **kwargs) Mapping[str, Any] [source]¶
Forward method for your Runner. Should not be called directly outside of the runner. If your model has a specific interface, override this method to use it.
- Parameters
batch (Mapping[str, Any]) – dictionary with data batches from DataLoaders.
**kwargs – additional parameters to pass to the model
- Returns
dict with model output batch
Python API¶
Runner¶
- class catalyst.runners.runner.Runner(*args, **kwargs)[source]¶
Bases:
catalyst.core.runner.IRunner
Single-stage deep learning Runner with user-friendly API.
Runner supports the logic for deep learning pipeline configuration with pure python code. Please check the examples for intuition.
- Parameters
*args – IRunner args (model, engine)
**kwargs – IRunner kwargs (model, engine)
Note
IRunner supports only base user-friendly callbacks, like TqdmCallback, TimerCallback, CheckRunCallback, BatchOverfitCallback, and CheckpointCallback.
It does not automatically add Criterion, Optimizer or Scheduler callbacks.
That means that you have to do the optimization step yourself inside the handle_batch method, or specify the required callbacks in the .train or get_callbacks methods (see the callback-driven sketch after the example below).
For a more easy-to-go supervised use case, please follow catalyst.runners.runner.SupervisedRunner.
Note
Please follow the minimal examples sections for use cases.
Examples:
import os
from torch import nn, optim
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.data import ToTensor
from catalyst.contrib.datasets import MNIST

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(
        MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()),
        batch_size=32,
    ),
    "valid": DataLoader(
        MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()),
        batch_size=32,
    ),
}

class CustomRunner(dl.Runner):
    def predict_batch(self, batch):
        # model inference step
        return self.model(batch[0].to(self.device))

    def on_loader_start(self, runner):
        super().on_loader_start(runner)
        self.meters = {
            key: metrics.AdditiveMetric(compute_on_call=False)
            for key in ["loss", "accuracy01", "accuracy03"]
        }

    def handle_batch(self, batch):
        # model train/valid step
        # unpack the batch
        x, y = batch
        # run model forward pass
        logits = self.model(x)
        # compute the loss
        loss = F.cross_entropy(logits, y)
        # compute other metrics of interest
        accuracy01, accuracy03 = metrics.accuracy(logits, y, topk=(1, 3))
        # log metrics
        self.batch_metrics.update(
            {"loss": loss, "accuracy01": accuracy01, "accuracy03": accuracy03}
        )
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.meters[key].update(self.batch_metrics[key].item(), self.batch_size)
        # run model backward pass
        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

    def on_loader_end(self, runner):
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.loader_metrics[key] = self.meters[key].compute()[0]
        super().on_loader_end(runner)

runner = CustomRunner()
# model training
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    logdir="./logs",
    num_epochs=5,
    verbose=True,
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
)
# model inference
for logits in runner.predict_loader(loader=loaders["valid"]):
    assert logits.detach().cpu().numpy().shape[-1] == 10
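As mentioned in the note above, plain Runner does not add Criterion, Optimizer or Scheduler callbacks automatically; the example handles the optimization step manually inside handle_batch. The sketch below shows the callback-driven alternative. It is illustrative only: it reuses the model, optimizer and loaders defined above and assumes the default key names ("logits", "targets", "loss") together with CriterionCallback and OptimizerCallback, both of which appear elsewhere on this page:

from torch import nn

class CallbackDrivenRunner(dl.Runner):
    def handle_batch(self, batch):
        # only prepare runner.batch; loss and optimization are handled by callbacks
        x, y = batch
        self.batch = {"features": x, "logits": self.model(x), "targets": y}

runner = CallbackDrivenRunner()
runner.train(
    model=model,
    criterion=nn.CrossEntropyLoss(),
    optimizer=optimizer,
    loaders=loaders,
    num_epochs=1,
    callbacks=[
        dl.CriterionCallback(input_key="logits", target_key="targets", metric_key="loss"),
        dl.OptimizerCallback(metric_key="loss"),
    ],
    logdir="./logs",
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
)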
- evaluate_loader(loader: torch.utils.data.dataloader.DataLoader, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, model: Optional[torch.nn.modules.module.Module] = None, seed: int = 42, verbose: bool = False) Dict[str, Any] [source]¶
Evaluates data from the loader with the given model and returns the obtained metrics.
- Parameters
loader – loader to predict
callbacks – list or dictionary with catalyst callbacks
model – model compatible with the current runner. If None, the runner's current model is used.
seed – random seed to use before prediction
verbose – if True, it displays the status of the evaluation to the console.
- Returns
Dict with metrics counted on the loader.
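A minimal usage sketch, assuming a dl.SupervisedRunner trained as in the SupervisedRunner example on this page; the metric names in the returned dict depend on the callbacks you pass (the accuracy keys here are illustrative):

eval_metrics = runner.evaluate_loader(
    loader=loaders["valid"],
    callbacks=[
        dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 3)),
    ],
    verbose=True,
)
# e.g. eval_metrics["accuracy01"], eval_metrics["accuracy03"]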
- get_callbacks(stage: str) OrderedDict[str, Callback] [source]¶
Returns the callbacks for a given stage.
- get_criterion(stage: str) torch.nn.modules.module.Module [source]¶
Returns the criterion for a given stage.
- get_engine() catalyst.core.engine.IEngine [source]¶
Returns the engine for a run.
- get_loaders(stage: str) OrderedDict[str, DataLoader] [source]¶
Returns the loaders for a given stage.
- get_loggers() Dict[str, catalyst.core.logger.ILogger] [source]¶
Returns the loggers for a run.
- get_optimizer(stage: str, model: torch.nn.modules.module.Module) torch.optim.optimizer.Optimizer [source]¶
Returns the optimizer for a given stage.
- get_scheduler(stage: str, optimizer: torch.optim.optimizer.Optimizer) torch.optim.lr_scheduler._LRScheduler [source]¶
Returns the scheduler for a given stage.
- get_trial() catalyst.core.trial.ITrial [source]¶
Returns the trial for a run.
- property hparams: Dict¶
Returns hyperparameters.
- property name: str¶
Returns run name.
- predict_batch(batch: Mapping[str, Any], **kwargs) Mapping[str, Any] [source]¶
Run model inference on specified data batch.
- Parameters
batch – dictionary with data batches from DataLoader.
**kwargs – additional kwargs to pass to the model
- Returns
model output dictionary
- Return type
Mapping
- Raises
NotImplementedError – if not implemented yet
- predict_loader(*, loader: torch.utils.data.dataloader.DataLoader, model: torch.nn.modules.module.Module = None, engine: Union[IEngine, str] = None, seed: int = 42, fp16: bool = False, amp: bool = False, apex: bool = False, ddp: bool = False) Generator [source]¶
Runs model inference on PyTorch DataLoader and returns python generator with model predictions from runner.predict_batch.
- Parameters
loader – loader to predict
model – model to use for prediction
engine – engine to use for prediction
seed – random seed to use before prediction
fp16 – boolean flag to use half-precision training (AMP > APEX)
amp – boolean flag to use amp half-precision
apex – boolean flag to use apex half-precision
ddp – if True will start training in distributed mode. Note: Works only with python scripts. No jupyter support.
- Yields
batches with model predictions
Note
Please follow the minimal examples sections for use cases.
Examples:
import os
from torch import nn, optim
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.data import ToTensor
from catalyst.contrib.datasets import MNIST

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(
        MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()),
        batch_size=32,
    ),
    "valid": DataLoader(
        MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()),
        batch_size=32,
    ),
}

class CustomRunner(dl.Runner):
    def predict_batch(self, batch):
        # model inference step
        return self.model(batch[0].to(self.device))

    def on_loader_start(self, runner):
        super().on_loader_start(runner)
        self.meters = {
            key: metrics.AdditiveMetric(compute_on_call=False)
            for key in ["loss", "accuracy01", "accuracy03"]
        }

    def handle_batch(self, batch):
        # model train/valid step
        # unpack the batch
        x, y = batch
        # run model forward pass
        logits = self.model(x)
        # compute the loss
        loss = F.cross_entropy(logits, y)
        # compute other metrics of interest
        accuracy01, accuracy03 = metrics.accuracy(logits, y, topk=(1, 3))
        # log metrics
        self.batch_metrics.update(
            {"loss": loss, "accuracy01": accuracy01, "accuracy03": accuracy03}
        )
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.meters[key].update(
                self.batch_metrics[key].item(), self.batch_size
            )
        # run model backward pass
        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

    def on_loader_end(self, runner):
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.loader_metrics[key] = self.meters[key].compute()[0]
        super().on_loader_end(runner)

runner = CustomRunner()
# model training
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    logdir="./logs",
    num_epochs=5,
    verbose=True,
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
)
# model inference
for logits in runner.predict_loader(loader=loaders["valid"]):
    assert logits.detach().cpu().numpy().shape[-1] == 10
- property seed: int¶
Experiment’s initial seed value.
- property stages: Iterable[str]¶
Experiment’s stage names (array with one value).
- train(*, loaders: OrderedDict[str, DataLoader], model: torch.nn.modules.module.Module, engine: Union[IEngine, str] = None, trial: catalyst.core.trial.ITrial = None, criterion: torch.nn.modules.module.Module = None, optimizer: torch.optim.optimizer.Optimizer = None, scheduler: torch.optim.lr_scheduler._LRScheduler = None, callbacks: Union[List[Callback], OrderedDict[str, Callback]] = None, loggers: Dict[str, ILogger] = None, seed: int = 42, hparams: Dict[str, Any] = None, num_epochs: int = 1, logdir: str = None, valid_loader: str = None, valid_metric: str = None, minimize_valid_metric: bool = True, verbose: bool = False, timeit: bool = False, check: bool = False, overfit: bool = False, load_best_on_end: bool = False, fp16: bool = False, amp: bool = False, apex: bool = False, ddp: bool = False) None [source]¶
Starts the train stage of the model.
- Parameters
loaders – dictionary with one or several torch.utils.data.DataLoader for training, validation or inference
model – model to train
engine – engine to use for model training
trial – trial to use during model training
criterion – criterion function for training
optimizer – optimizer for training
scheduler – scheduler for training
callbacks – list or dictionary with Catalyst callbacks
loggers – dictionary with Catalyst loggers
seed – experiment’s initial seed value
hparams – hyperparameters for the run
num_epochs – number of training epochs
logdir – path to output directory
valid_loader – loader name used to calculate the metrics and save the checkpoints. For example, you can pass train and then the metrics will be taken from train loader.
valid_metric – the key to the name of the metric by which the checkpoints will be selected.
minimize_valid_metric – flag to indicate whether the valid_metric should be minimized or not (default: True)
verbose – if True, it displays the status of the training to the console.
timeit – if True, computes the execution time of training process and displays it to the console.
check – if True, then only checks that pipeline is working (3 epochs only with 3 batches per loader)
overfit – if True, then takes only one batch per loader for model overfitting; for advanced usage please check BatchOverfitCallback
load_best_on_end – if True, Runner will load the best checkpoint state (model, optimizer, etc.) according to validation metrics. Requires specified logdir.
fp16 – boolean flag to use half-precision training (AMP > APEX)
amp – boolean flag to use amp half-precision
apex – boolean flag to use apex half-precision
ddp – if True will start training in distributed mode. Note: Works only with python scripts. No jupyter support.
Note
Please follow the minimal examples sections for use cases.
Examples:
import os
from torch import nn, optim
from torch.nn import functional as F
from torch.utils.data import DataLoader
from catalyst import dl, metrics
from catalyst.data import ToTensor
from catalyst.contrib.datasets import MNIST

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(
        MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()),
        batch_size=32,
    ),
    "valid": DataLoader(
        MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()),
        batch_size=32,
    ),
}

class CustomRunner(dl.Runner):
    def predict_batch(self, batch):
        # model inference step
        return self.model(batch[0].to(self.device))

    def on_loader_start(self, runner):
        super().on_loader_start(runner)
        self.meters = {
            key: metrics.AdditiveMetric(compute_on_call=False)
            for key in ["loss", "accuracy01", "accuracy03"]
        }

    def handle_batch(self, batch):
        # model train/valid step
        # unpack the batch
        x, y = batch
        # run model forward pass
        logits = self.model(x)
        # compute the loss
        loss = F.cross_entropy(logits, y)
        # compute other metrics of interest
        accuracy01, accuracy03 = metrics.accuracy(logits, y, topk=(1, 3))
        # log metrics
        self.batch_metrics.update(
            {"loss": loss, "accuracy01": accuracy01, "accuracy03": accuracy03}
        )
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.meters[key].update(
                self.batch_metrics[key].item(), self.batch_size
            )
        # run model backward pass
        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()

    def on_loader_end(self, runner):
        for key in ["loss", "accuracy01", "accuracy03"]:
            self.loader_metrics[key] = self.meters[key].compute()[0]
        super().on_loader_end(runner)

runner = CustomRunner()
# model training
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    logdir="./logs",
    num_epochs=5,
    verbose=True,
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
)
# model inference
for logits in runner.predict_loader(loader=loaders["valid"]):
    assert logits.detach().cpu().numpy().shape[-1] == 10
SupervisedRunner¶
- class catalyst.runners.runner.SupervisedRunner(model: Optional[Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]]] = None, engine: Optional[catalyst.core.engine.IEngine] = None, input_key: Any = 'features', output_key: Any = 'logits', target_key: str = 'targets', loss_key: str = 'loss')[source]¶
Bases:
catalyst.runners.supervised.ISupervisedRunner, catalyst.runners.runner.Runner
Runner for experiments with a supervised model.
- Parameters
model – Torch model instance
engine – IEngine instance
input_key – key in runner.batch dict mapping for model input
output_key – key for runner.batch to store model output
target_key – key in runner.batch dict mapping for target
loss_key – key for runner.batch_metrics to store criterion loss output
Note
Please follow the minimal examples sections for use cases.
Examples:
import os
from torch import nn, optim
from torch.utils.data import DataLoader
from catalyst import dl, utils
from catalyst.data import ToTensor
from catalyst.contrib.datasets import MNIST

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.02)

loaders = {
    "train": DataLoader(
        MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()),
        batch_size=32,
    ),
    "valid": DataLoader(
        MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()),
        batch_size=32,
    ),
}

runner = dl.SupervisedRunner(
    input_key="features", output_key="logits", target_key="targets", loss_key="loss"
)

# model training
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    loaders=loaders,
    num_epochs=1,
    callbacks=[
        dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 3)),
        dl.PrecisionRecallF1SupportCallback(
            input_key="logits", target_key="targets", num_classes=10
        ),
        dl.AUCCallback(input_key="logits", target_key="targets"),
    ],
    logdir="./logs",
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
    verbose=True,
    load_best_on_end=True,
)

# model inference
for prediction in runner.predict_loader(loader=loaders["valid"]):
    assert prediction["logits"].detach().cpu().numpy().shape[-1] == 10
- __init__(model: Optional[Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]]] = None, engine: Optional[catalyst.core.engine.IEngine] = None, input_key: Any = 'features', output_key: Any = 'logits', target_key: str = 'targets', loss_key: str = 'loss')[source]¶
Init.
- get_callbacks(stage: str) OrderedDict[str, Callback] [source]¶
Prepares the callbacks for selected stage.
- Parameters
stage – stage name
- Returns
dictionary with stage callbacks
- predict_batch(batch: Mapping[str, Any], **kwargs) Mapping[str, Any] [source]¶
Run model inference on specified data batch.
Warning
You should not override this method. If you need a specific model call, override the forward() method instead.
- Parameters
batch – dictionary with data batch from DataLoader.
**kwargs – additional kwargs to pass to the model
- Returns
model output dictionary
- Return type
Mapping[str, Any]
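A minimal single-batch inference sketch, assuming the runner has already been trained as in the example above and that SupervisedRunner's default batch handling maps a (features, targets) tuple onto input_key/target_key:

batch = next(iter(loaders["valid"]))  # (features, targets) tuple from the MNIST loader
output = runner.predict_batch(batch)
assert output["logits"].shape[-1] == 10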
SelfSupervisedRunner¶
- class catalyst.runners.runner.SelfSupervisedRunner(model: Optional[Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]]] = None, engine: Optional[catalyst.core.engine.IEngine] = None, input_key: str = 'features', target_key: str = 'target', loss_key: str = 'loss', augemention_prefix: str = 'augment', projection_prefix: str = 'projection', embedding_prefix: str = 'embedding', loss_mode_prefix: str = 'projection')[source]¶
Bases:
catalyst.runners.self_supervised.ISelfSupervisedRunner, catalyst.runners.runner.Runner
Runner for experiments with a contrastive model.
- Parameters
input_key – key in runner.batch dict mapping for model input
target_key – key in runner.batch dict mapping for target
loss_key – key for runner.batch_metrics to store criterion loss output
augemention_prefix – key for runner.batch to sample augmentations
projection_prefix – key for runner.batch to store model projection
embedding_prefix – key for runner.batch to store model embeddings
loss_mode_prefix – selector key for loss calculation
Examples:
# 1. loader and transforms
transforms = Compose(
    [
        ToTensor(),
        Normalize((0.1307,), (0.3081,)),
        torchvision.transforms.RandomCrop((28, 28)),
        torchvision.transforms.RandomVerticalFlip(),
        torchvision.transforms.RandomHorizontalFlip(),
    ]
)

mnist = MNIST("./logdir", train=True, download=True, transform=None)
contrastive_mnist = ContrastiveDataset(mnist, transforms=transforms)
train_loader = torch.utils.data.DataLoader(contrastive_mnist, batch_size=BATCH_SIZE)

# 2. model and optimizer
encoder = MnistSimpleNet(out_features=16)
projection_head = nn.Sequential(
    nn.Linear(16, 16, bias=False),
    nn.ReLU(inplace=True),
    nn.Linear(16, 16, bias=True),
)

class ContrastiveModel(torch.nn.Module):
    def __init__(self, model, encoder):
        super(ContrastiveModel, self).__init__()
        self.model = model
        self.encoder = encoder

    def forward(self, x):
        emb = self.encoder(x)
        projection = self.model(emb)
        return emb, projection

model = ContrastiveModel(model=projection_head, encoder=encoder)
optimizer = Adam(model.parameters(), lr=LR)

# 3. criterion with triplets sampling
criterion = NTXentLoss(tau=0.1)

callbacks = [
    dl.ControlFlowCallback(
        dl.CriterionCallback(
            input_key="projection_left", target_key="projection_right", metric_key="loss"
        ),
        loaders="train",
    ),
    dl.SklearnModelCallback(
        feature_key="embedding_left",
        target_key="target",
        train_loader="train",
        valid_loaders="valid",
        model_fn=RandomForestClassifier,
        predict_method="predict_proba",
        predict_key="sklearn_predict",
        random_state=RANDOM_STATE,
        n_estimators=10,
    ),
    dl.ControlFlowCallback(
        dl.AccuracyCallback(
            target_key="target", input_key="sklearn_predict", topk_args=(1, 3)
        ),
        loaders="valid",
    ),
]

runner = dl.ContrastiveRunner()
logdir = "./logdir"

runner.train(
    model=model,
    engine=engine or dl.DeviceEngine(device),
    criterion=criterion,
    optimizer=optimizer,
    callbacks=callbacks,
    loaders={"train": train_loader, "valid": train_loader},
    verbose=True,
    logdir=logdir,
    valid_loader="train",
    valid_metric="loss",
    minimize_valid_metric=True,
    num_epochs=10,
)
- __init__(model: Optional[Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]]] = None, engine: Optional[catalyst.core.engine.IEngine] = None, input_key: str = 'features', target_key: str = 'target', loss_key: str = 'loss', augemention_prefix: str = 'augment', projection_prefix: str = 'projection', embedding_prefix: str = 'embedding', loss_mode_prefix: str = 'projection')[source]¶
Init.
- get_callbacks(stage: str) OrderedDict[str, Callback] [source]¶
Prepares the callbacks for selected stage.
- Parameters
stage – stage name
- Returns
dictionary with stage callbacks
- predict_batch(batch: Mapping[str, Any], **kwargs) Mapping[str, Any] [source]¶
Run model inference on specified data batch.
Warning
You should not override this method. If you need a specific model call, override the forward() method instead.
- Parameters
batch – dictionary with data batch from DataLoader.
**kwargs – additional kwargs to pass to the model
- Returns
model output dictionary
- Return type
Mapping[str, Any]
Config API¶
ConfigRunner¶
- class catalyst.runners.config.ConfigRunner(config: Dict)[source]¶
Bases:
catalyst.core.runner.IRunner
Runner created from a dictionary configuration file. Used for Catalyst Config API.
- Parameters
config – dictionary with parameters
Note
Please follow the minimal examples sections for use cases.
Examples:
dataset = SomeDataset()
runner = SupervisedConfigRunner(
    config={
        "args": {"logdir": logdir},
        "model": {"_target_": "SomeModel", "in_features": 4, "out_features": 2},
        "engine": {"_target_": "DeviceEngine", "device": device},
        "stages": {
            "stage1": {
                "num_epochs": 10,
                "criterion": {"_target_": "MSELoss"},
                "optimizer": {"_target_": "Adam", "lr": 1e-3},
                "loaders": {"batch_size": 4, "num_workers": 0},
                "callbacks": {
                    "criterion": {
                        "_target_": "CriterionCallback",
                        "metric_key": "loss",
                        "input_key": "logits",
                        "target_key": "targets",
                    },
                    "optimizer": {
                        "_target_": "OptimizerCallback",
                        "metric_key": "loss",
                    },
                },
            },
        },
    }
)
runner.get_datasets = lambda *args, **kwargs: {
    "train": dataset,
    "valid": dataset,
}
runner.run()
- get_callbacks(stage: str) OrderedDict[str, Callback] [source]¶
Returns the callbacks for a given stage.
- get_criterion(stage: str) Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] [source]¶
Returns the criterion for a given stage.
- get_datasets(stage: str) OrderedDict[str, Dataset] [source]¶
Returns datasets for a given stage.
- Parameters
stage – stage name
- Returns
datasets objects
- Return type
Dict
- get_engine() catalyst.core.engine.IEngine [source]¶
Returns the engine for the run.
- get_loaders(stage: str) OrderedDict[str, DataLoader] [source]¶
Returns loaders for a given stage.
- Parameters
stage – stage name
- Returns
loaders objects
- Return type
Dict
- get_loggers() Dict[str, catalyst.core.logger.ILogger] [source]¶
Returns the loggers for the run.
- get_model(stage: str) Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] [source]¶
Returns the model for a given stage.
- get_optimizer(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]], stage: str) Union[torch.optim.optimizer.Optimizer, Dict[str, torch.optim.optimizer.Optimizer]] [source]¶
Returns the optimizer for a given stage and epoch.
- Parameters
model – model or a dict of models
stage – current stage name
- Returns
optimizer for selected stage and epoch
- get_samplers(stage: str) OrderedDict[str, Sampler] [source]¶
Returns samplers for a given stage.
- Parameters
stage – stage name
- Returns
Dict of samplers
- get_scheduler(optimizer: Union[torch.optim.optimizer.Optimizer, Dict[str, torch.optim.optimizer.Optimizer]], stage: str) Union[torch.optim.lr_scheduler._LRScheduler, Dict[str, torch.optim.lr_scheduler._LRScheduler]] [source]¶
Returns the scheduler for a given stage.
- get_stage_len(stage: str) int [source]¶
Returns number of epochs for the selected stage.
- Parameters
stage – current stage
- Returns
number of epochs in stage
Example:
>>> runner.get_stage_len("pretraining") 3
- get_trial() catalyst.core.trial.ITrial [source]¶
Returns the trial for the run.
- property hparams: Dict¶
Returns hyperparameters.
- property logdir: str¶
Experiment’s logdir for artefacts and logging.
- property name: str¶
Returns run name for monitoring tools.
- property seed: int¶
Experiment’s seed for reproducibility.
- property stages: List[str]¶
Experiment’s stage names.
SupervisedConfigRunner¶
- class catalyst.runners.config.SupervisedConfigRunner(config: Optional[Dict] = None, input_key: Any = 'features', output_key: Any = 'logits', target_key: str = 'targets', loss_key: str = 'loss')[source]¶
Bases:
catalyst.runners.supervised.ISupervisedRunner, catalyst.runners.config.ConfigRunner
ConfigRunner for supervised tasks
- Parameters
config – dictionary with parameters
input_key – key in runner.batch dict mapping for model input
output_key – key for runner.batch to store model output
target_key – key in runner.batch dict mapping for target
loss_key – key for runner.batch_metrics to store criterion loss output
Note
Please follow the minimal examples sections for use cases.
Examples:
dataset = SomeDataset()
runner = SupervisedConfigRunner(
    config={
        "args": {"logdir": logdir},
        "model": {"_target_": "SomeModel", "in_features": 4, "out_features": 2},
        "engine": {"_target_": "DeviceEngine", "device": device},
        "stages": {
            "stage1": {
                "num_epochs": 10,
                "criterion": {"_target_": "MSELoss"},
                "optimizer": {"_target_": "Adam", "lr": 1e-3},
                "loaders": {
                    "batch_size": 4,
                    "num_workers": 0,
                    "datasets": {
                        "train": {
                            "_target_": "SelfSupervisedDatasetWrapper",
                            "dataset": dataset,
                        },
                        "transforms": ...,
                        "transform_original": ...,
                    },
                },
                "callbacks": {
                    "criterion": {
                        "_target_": "CriterionCallback",
                        "metric_key": "loss",
                        "input_key": "logits",
                        "target_key": "targets",
                    },
                    "optimizer": {
                        "_target_": "OptimizerCallback",
                        "metric_key": "loss",
                    },
                },
            },
        },
    }
)
runner.run()
SelfSupervisedConfigRunner¶
- class catalyst.runners.config.SelfSupervisedConfigRunner(config: Optional[Dict] = None, input_key: str = 'features', target_key: str = 'target', loss_key: str = 'loss', augemention_prefix: str = 'augment', projection_prefix: str = 'projection', embedding_prefix: str = 'embedding')[source]¶
Bases:
catalyst.runners.self_supervised.ISelfSupervisedRunner, catalyst.runners.config.ConfigRunner
ConfigRunner for contrastive tasks
- Parameters
config – dictionary with parameters
input_key – key in runner.batch dict mapping for model input
target_key – key in runner.batch dict mapping for target
loss_key – key for runner.batch_metrics to store criterion loss output
augemention_prefix – key for runner.batch to sample augmentations
projection_prefix – key for runner.batch to store model projection
embedding_prefix – key for runner.batch to store model embeddings
Note
Please follow the minimal examples sections for use cases.
Examples:
dataset = SomeDataset()
runner = SelfSupervisedConfigRunner(
    config={
        "args": {"logdir": logdir},
        "model": {"_target_": "SomeContrastiveModel", ...},
        "engine": {"_target_": "DeviceEngine", "device": device},
        "stages": {
            "stage1": {
                "num_epochs": 10,
                "criterion": {"_target_": "NTXentLoss", "tau": 0.1},
                "optimizer": {"_target_": "Adam", "lr": 1e-3},
                "loaders": {"batch_size": 4, "num_workers": 0},
                "callbacks": {
                    "criterion": {
                        "_target_": "CriterionCallback",
                        "metric_key": "loss",
                        "input_key": "logits",
                        "target_key": "targets",
                    },
                    "optimizer": {
                        "_target_": "OptimizerCallback",
                        "metric_key": "loss",
                    },
                },
            },
        },
    }
)
runner.get_datasets = lambda *args, **kwargs: {
    "train": dataset,
    "valid": dataset,
}
runner.run()
Hydra API¶
HydraRunner¶
- class catalyst.runners.hydra.HydraRunner(cfg: omegaconf.dictconfig.DictConfig)[source]¶
Bases:
catalyst.core.runner.IRunner
Runner created from a Hydra configuration file.
- Parameters
cfg – Hydra dictionary with parameters
Note
Please follow the minimal examples sections for use cases.
- get_callbacks(stage: str) OrderedDict[str, Callback] [source]¶
Returns the callbacks for a given stage.
- get_criterion(stage: str) Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] [source]¶
Returns the criterion for a given stage.
- get_datasets(stage: str) OrderedDict[str, Dataset] [source]¶
Returns datasets for a given stage.
- Parameters
stage – stage name
- Returns
datasets objects
- Return type
Dict
- get_engine() catalyst.core.engine.IEngine [source]¶
Returns the engine for the run.
- get_loaders(stage: str) Dict[str, torch.utils.data.dataloader.DataLoader] [source]¶
Returns loaders for a given stage.
- Parameters
stage – stage name
- Returns
loaders objects
- Return type
Dict
- get_loggers() Dict[str, catalyst.core.logger.ILogger] [source]¶
Returns the loggers for the run.
- get_model(stage: str) Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]] [source]¶
Returns the model for a given stage.
- get_optimizer(model: Union[torch.nn.modules.module.Module, Dict[str, torch.nn.modules.module.Module]], stage: str) Union[torch.optim.optimizer.Optimizer, Dict[str, torch.optim.optimizer.Optimizer]] [source]¶
Returns the optimizer for a given stage and epoch.
- Parameters
model – model or a dict of models
stage – current stage name
- Returns
optimizer for selected stage and epoch
- get_samplers(stage: str) OrderedDict[str, Sampler] [source]¶
Returns samplers for a given stage.
- Parameters
stage – stage name
- Returns
Dict of samplers
- get_scheduler(optimizer: Union[torch.optim.optimizer.Optimizer, Dict[str, torch.optim.optimizer.Optimizer]], stage: str) Union[torch.optim.lr_scheduler._LRScheduler, Dict[str, torch.optim.lr_scheduler._LRScheduler]] [source]¶
Returns the schedulers for a given stage.
- get_stage_len(stage: str) int [source]¶
Returns number of epochs for the selected stage.
- Parameters
stage – current stage
- Returns
number of epochs in stage
Example:
>>> runner.get_stage_len("pretraining") 3
- get_transform(params: omegaconf.dictconfig.DictConfig) Callable [source]¶
Returns the data transforms for a dataset.
- Parameters
params – parameters of the transformation
- Returns
Data transformations to use
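An illustrative sketch only; it assumes the "_target_" instantiation convention used by the config examples on this page, and the torchvision transform is an arbitrary choice:

from omegaconf import OmegaConf

transform_params = OmegaConf.create({"_target_": "torchvision.transforms.ToTensor"})
transform = runner.get_transform(transform_params)  # returns a callable transform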
- get_trial() catalyst.core.trial.ITrial [source]¶
Returns the trial for the run.
- property hparams: collections.OrderedDict¶
Hyperparameters
- property logdir: str¶
Experiment’s logdir for artefacts and logging.
- property name: str¶
Returns run name for monitoring tools.
- property seed: int¶
Experiment’s seed for reproducibility.
- property stages: List[str]¶
Experiment’s stage names.
SupervisedHydraRunner¶
- class catalyst.runners.hydra.SupervisedHydraRunner(cfg: Optional[omegaconf.dictconfig.DictConfig] = None, input_key: Any = 'features', output_key: Any = 'logits', target_key: str = 'targets', loss_key: str = 'loss')[source]¶
Bases:
catalyst.runners.supervised.ISupervisedRunner, catalyst.runners.hydra.HydraRunner
HydraRunner for supervised tasks
- Parameters
cfg – Hydra dictionary with parameters
input_key – key in runner.batch dict mapping for model input
output_key – key for runner.batch to store model output
target_key – key in runner.batch dict mapping for target
loss_key – key for runner.batch_metrics to store criterion loss output
Note
Please follow the minimal examples sections for use cases.
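Examples:
No dedicated Hydra example is given on this page; as a hedged sketch, the configuration below simply mirrors the SupervisedConfigRunner dictionary shown earlier, wrapped in an omegaconf DictConfig. The exact schema of your Hydra configs may differ, and SomeModel, logdir, device and dataset are placeholders from that example:

from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "args": {"logdir": logdir},
        "model": {"_target_": "SomeModel", "in_features": 4, "out_features": 2},
        "engine": {"_target_": "DeviceEngine", "device": device},
        "stages": {
            "stage1": {
                "num_epochs": 10,
                "criterion": {"_target_": "MSELoss"},
                "optimizer": {"_target_": "Adam", "lr": 1e-3},
                "loaders": {"batch_size": 4, "num_workers": 0},
                "callbacks": {
                    "criterion": {
                        "_target_": "CriterionCallback",
                        "metric_key": "loss",
                        "input_key": "logits",
                        "target_key": "targets",
                    },
                    "optimizer": {"_target_": "OptimizerCallback", "metric_key": "loss"},
                },
            },
        },
    }
)

runner = SupervisedHydraRunner(cfg=cfg)
runner.get_datasets = lambda *args, **kwargs: {"train": dataset, "valid": dataset}
runner.run()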