Shortcuts

Contrib

NN

Criterion

class catalyst.contrib.nn.criterion.ce.MaskCrossEntropyLoss(*args, target_name: str = 'targets', mask_name: str = 'mask', **kwargs)[source]

Bases: torch.nn.modules.loss.CrossEntropyLoss

@TODO: Docs. Contribution is welcome.

__init__(*args, target_name: str = 'targets', mask_name: str = 'mask', **kwargs)[source]

@TODO: Docs. Contribution is welcome.

forward(input: torch.Tensor, target_mask: torch.Tensor) → torch.Tensor[source]

Calculates loss between input and target_mask tensors.

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.ce.SymmetricCrossEntropyLoss(alpha: float = 1.0, beta: float = 1.0)[source]

Bases: torch.nn.modules.module.Module

The Symmetric Cross Entropy loss.

It has been proposed in Symmetric Cross Entropy for Robust Learning with Noisy Labels.

__init__(alpha: float = 1.0, beta: float = 1.0)[source]
Parameters
  • alpha (float) – weight of the CE term; corresponds to the overfitting issue of CE

  • beta (float) – weight of the RCE term; corresponds to the flexible exploration of RCE's robustness

forward(input: torch.Tensor, target: torch.Tensor) → torch.Tensor[source]

Calculates loss between input and target tensors.

Parameters
  • input (torch.Tensor) – input tensor of size (batch_size, num_classes)

  • target (torch.Tensor) – target tensor of size (batch_size), where each value is a class index
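A minimal usage sketch based on the shapes documented above (the concrete numbers are illustrative):

>>> criterion = SymmetricCrossEntropyLoss(alpha=1.0, beta=1.0)
>>> input = torch.randn(8, 10)           # (batch_size, num_classes) logits
>>> target = torch.randint(0, 10, (8,))  # (batch_size,) class indices
>>> loss = criterion(input, target)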

class catalyst.contrib.nn.criterion.ce.NaiveCrossEntropyLoss(size_average=True)[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(size_average=True)[source]

@TODO: Docs. Contribution is welcome.

forward(input: torch.Tensor, target: torch.Tensor) → torch.Tensor[source]

Calculates loss between input and target tensors.

Parameters
  • input (torch.Tensor) – input tensor of shape …

  • target (torch.Tensor) – target tensor of shape …

@TODO: Docs (add shapes). Contribution is welcome.

class catalyst.contrib.nn.criterion.contrastive.ContrastiveEmbeddingLoss(margin=1.0, reduction='mean')[source]

Bases: torch.nn.modules.module.Module

The Contrastive embedding loss.

It has been proposed in Dimensionality Reduction by Learning an Invariant Mapping.

__init__(margin=1.0, reduction='mean')[source]
Parameters
  • margin – margin parameter

  • reduction – criterion reduction type

forward(embeddings_left: torch.Tensor, embeddings_right: torch.Tensor, distance_true) → torch.Tensor[source]

Forward propagation method for the contrastive loss.

Parameters
  • embeddings_left (torch.Tensor) – left objects embeddings

  • embeddings_right (torch.Tensor) – right objects embeddings

  • distance_true – true distances

Returns

loss

Return type

torch.Tensor
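A minimal usage sketch; it assumes distance_true holds the usual binary similar/dissimilar labels of the contrastive setup, which the docstring above does not confirm:

>>> criterion = ContrastiveEmbeddingLoss(margin=1.0)
>>> embeddings_left = torch.randn(8, 128)   # assumed embedding size
>>> embeddings_right = torch.randn(8, 128)
>>> distance_true = torch.randint(0, 2, (8,)).float()  # assumed binary labels
>>> loss = criterion(embeddings_left, embeddings_right, distance_true)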

class catalyst.contrib.nn.criterion.contrastive.ContrastiveDistanceLoss(margin=1.0, reduction='mean')[source]

Bases: torch.nn.modules.module.Module

The Contrastive distance loss.

@TODO: Docs. Contribution is welcome.

__init__(margin=1.0, reduction='mean')[source]
Parameters
  • margin – margin parameter

  • reduction (str) – criterion reduction type

forward(distance_pred, distance_true) → torch.Tensor[source]

Forward propagation method for the contrastive loss.

Parameters
  • distance_pred – predicted distances

  • distance_true – true distances

Returns

loss

Return type

torch.Tensor

class catalyst.contrib.nn.criterion.contrastive.ContrastivePairwiseEmbeddingLoss(margin=1.0, reduction='mean')[source]

Bases: torch.nn.modules.module.Module

ContrastivePairwiseEmbeddingLoss – proof of concept criterion.

Still work in progress.

@TODO: Docs. Contribution is welcome.

__init__(margin=1.0, reduction='mean')[source]
Parameters
  • margin – margin parameter

  • reduction – criterion reduction type

forward(embeddings_pred, embeddings_true) → torch.Tensor[source]

Forward propagation method for the contrastive loss.

Work in progress.

Parameters
  • embeddings_pred – predicted embeddings

  • embeddings_true – true embeddings

Returns

loss

Return type

torch.Tensor

class catalyst.contrib.nn.criterion.dice.BCEDiceLoss(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid', bce_weight: float = 0.5, dice_weight: float = 0.5)[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid', bce_weight: float = 0.5, dice_weight: float = 0.5)[source]

@TODO: Docs. Contribution is welcome.

forward(outputs, targets)[source]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.dice.DiceLoss(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

@TODO: Docs. Contribution is welcome.

forward(logits: torch.Tensor, targets: torch.Tensor)[source]

Calculates loss between logits and target tensors.

@TODO: Docs. Contribution is welcome
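In the meantime, a minimal usage sketch for binary segmentation (the 4D NCHW shape is an assumption, not part of the documented API):

>>> criterion = DiceLoss(activation="Sigmoid")
>>> logits = torch.randn(4, 1, 32, 32)                  # raw model outputs
>>> targets = (torch.rand(4, 1, 32, 32) > 0.5).float()  # binary masks
>>> loss = criterion(logits, targets)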

class catalyst.contrib.nn.criterion.focal.FocalLossBinary(ignore: int = None, reduced: bool = False, gamma: float = 2.0, alpha: float = 0.25, threshold: float = 0.5, reduction: str = 'mean')[source]

Bases: torch.nn.modules.loss._Loss

Compute focal loss for binary classification problem.

It has been proposed in Focal Loss for Dense Object Detection paper.

@TODO: Docs (add Example). Contribution is welcome.
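In the meantime, a minimal usage sketch following the [bs; …] shape notation below (the exact shapes are assumptions):

>>> criterion = FocalLossBinary(gamma=2.0, alpha=0.25)
>>> logits = torch.randn(8)                  # [bs]
>>> targets = (torch.rand(8) > 0.5).float()  # [bs] binary labels
>>> loss = criterion(logits, targets)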

__init__(ignore: int = None, reduced: bool = False, gamma: float = 2.0, alpha: float = 0.25, threshold: float = 0.5, reduction: str = 'mean')[source]

@TODO: Docs. Contribution is welcome.

forward(logits, targets)[source]
Parameters
  • logits – [bs; …]

  • targets – [bs; …]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.focal.FocalLossMultiClass(ignore: int = None, reduced: bool = False, gamma: float = 2.0, alpha: float = 0.25, threshold: float = 0.5, reduction: str = 'mean')[source]

Bases: catalyst.contrib.nn.criterion.focal.FocalLossBinary

Compute focal loss for multi-class problem. Ignores targets having -1 label.

It has been proposed in Focal Loss for Dense Object Detection paper.

@TODO: Docs (add Example). Contribution is welcome.
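In the meantime, a minimal usage sketch following the shape notation below (shapes are assumptions):

>>> criterion = FocalLossMultiClass(gamma=2.0)
>>> logits = torch.randn(8, 5)           # [bs; num_classes]
>>> targets = torch.randint(0, 5, (8,))  # [bs]; a -1 label is ignored
>>> loss = criterion(logits, targets)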

forward(logits, targets)[source]
Parameters
  • logits – [bs; num_classes; …]

  • targets – [bs; …]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.gan.MeanOutputLoss[source]

Bases: torch.nn.modules.module.Module

Criterion that computes a simple mean of the output, completely ignoring the target (useful e.g. for WGAN real/fake validity averaging).

forward(output, target)[source]

Compute criterion.

@TODO: Docs (add typing). Contribution is welcome.

class catalyst.contrib.nn.criterion.gan.GradientPenaltyLoss[source]

Bases: torch.nn.modules.module.Module

Criterion to compute gradient penalty.

WARN: SHOULD NOT BE RUN WITH CriterionCallback, use the special GradientPenaltyCallback instead.

forward(fake_data, real_data, critic, critic_condition_args)[source]

Compute gradient penalty.

Parameters

@TODO – Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.huber.HuberLoss(clip_delta=1.0, reduction='mean')[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(clip_delta=1.0, reduction='mean')[source]

@TODO: Docs. Contribution is welcome.

forward(y_pred: torch.Tensor, y_true: torch.Tensor, weights=None) → torch.Tensor[source]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.iou.IoULoss(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

Bases: torch.nn.modules.module.Module

The intersection over union (Jaccard) loss.

@TODO: Docs. Contribution is welcome.

__init__(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]
Parameters
  • eps (float) – epsilon to avoid zero division

  • threshold (float) – threshold for outputs binarization

  • activation (str) – a torch.nn activation applied to the outputs. Must be one of 'none', 'Sigmoid', 'Softmax2d'

forward(outputs, targets)[source]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.iou.BCEIoULoss(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid', reduction: str = 'mean')[source]

Bases: torch.nn.modules.module.Module

The Intersection over union (Jaccard) with BCE loss.

@TODO: Docs. Contribution is welcome.

__init__(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid', reduction: str = 'mean')[source]
Parameters
  • eps (float) – epsilon to avoid zero division

  • threshold (float) – threshold for outputs binarization

  • activation (str) – a torch.nn activation applied to the outputs. Must be one of 'none', 'Sigmoid', 'Softmax2d'

  • reduction (str) – Specifies the reduction to apply to the output of BCE

forward(outputs, targets)[source]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.lovasz.LovaszLossBinary(per_image=False, ignore=None)[source]

Bases: torch.nn.modules.loss._Loss

Creates a criterion that optimizes a binary Lovasz loss.

It has been proposed in The Lovasz-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks.

__init__(per_image=False, ignore=None)[source]

@TODO: Docs. Contribution is welcome.

forward(logits, targets)[source]

Forward propagation method for the Lovasz loss.

Parameters
  • logits – [bs; …]

  • targets – [bs; …]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.lovasz.LovaszLossMultiClass(per_image=False, ignore=None)[source]

Bases: torch.nn.modules.loss._Loss

Creates a criterion that optimizes a multi-class Lovasz loss.

It has been proposed in The Lovasz-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks.

__init__(per_image=False, ignore=None)[source]

@TODO: Docs. Contribution is welcome.

forward(logits, targets)[source]

Forward propagation method for the Lovasz loss.

Parameters
  • logits – [bs; num_classes; …]

  • targets – [bs; …]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.lovasz.LovaszLossMultiLabel(per_image=False, ignore=None)[source]

Bases: torch.nn.modules.loss._Loss

Creates a criterion that optimizes a multi-label Lovasz loss.

It has been proposed in The Lovasz-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks.

__init__(per_image=False, ignore=None)[source]

@TODO: Docs. Contribution is welcome.

forward(logits, targets)[source]

Forward propagation method for the Lovasz loss.

Parameters
  • logits – [bs; num_classes; …]

  • targets – [bs; num_classes; …]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.margin.MarginLoss(alpha: float = 0.2, beta: float = 1.0, skip_labels: Union[int, List[int]] = -1)[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(alpha: float = 0.2, beta: float = 1.0, skip_labels: Union[int, List[int]] = -1)[source]
Parameters
  • alpha (float) –

  • beta (float) –

  • skip_labels (int or List[int]) –

@TODO: Docs. Contribution is welcome.

forward(embeddings: torch.Tensor, targets: torch.Tensor) → torch.Tensor[source]

Forward propagation method for the margin loss.

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.criterion.triplet.TripletLoss(margin: float = 0.3)[source]

Bases: torch.nn.modules.module.Module

Triplet loss with hard positive/negative mining.

Reference:

Code imported from https://github.com/NegatioN/OnlineMiningTripletLoss

__init__(margin: float = 0.3)[source]
Parameters

margin (float) – margin for triplet

forward(embeddings, targets)[source]

Forward propagation method for the triplet loss.

Parameters
  • embeddings – tensor of shape (batch_size, embed_dim)

  • targets – labels of the batch, of size (batch_size,)

Returns

scalar tensor containing the triplet loss

Return type

triplet_loss
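A minimal usage sketch based on the documented shapes (values are illustrative):

>>> criterion = TripletLoss(margin=0.3)
>>> embeddings = torch.randn(32, 128)     # (batch_size, embed_dim)
>>> targets = torch.randint(0, 4, (32,))  # (batch_size,) labels
>>> loss = criterion(embeddings, targets)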

class catalyst.contrib.nn.criterion.triplet.TripletPairwiseEmbeddingLoss(margin: float = 0.3, reduction: str = 'mean')[source]

Bases: torch.nn.modules.module.Module

TripletPairwiseEmbeddingLoss – proof of concept criterion.

Still work in progress.

@TODO: Docs. Contribution is welcome.

__init__(margin: float = 0.3, reduction: str = 'mean')[source]
Parameters
  • margin (float) – margin parameter

  • reduction (str) – criterion reduction type

forward(embeddings_pred, embeddings_true)[source]

Work in progress.

Parameters
  • embeddings_pred – predicted embeddings with shape [batch_size, embedding_size]

  • embeddings_true – true embeddings with shape [batch_size, embedding_size]

Returns

loss

Return type

torch.Tensor

class catalyst.contrib.nn.criterion.wing.WingLoss(width: int = 5, curvature: float = 0.5, reduction: str = 'mean')[source]

Bases: torch.nn.modules.module.Module

Creates a criterion that optimizes a Wing loss.

It has been proposed in Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks.

Examples

@TODO: Docs. Contribution is welcome.
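In the meantime, a minimal usage sketch (treating outputs/targets as same-shaped coordinate tensors is an assumption):

>>> criterion = WingLoss(width=5, curvature=0.5)
>>> outputs = torch.randn(4, 10)  # e.g. flattened landmark coordinates
>>> targets = torch.randn(4, 10)
>>> loss = criterion(outputs, targets)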

Main origins of inspiration:

https://github.com/BloodAxe/pytorch-toolbelt

__init__(width: int = 5, curvature: float = 0.5, reduction: str = 'mean')[source]
Parameters

@TODO – Docs. Contribution is welcome.

forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor[source]
Parameters

@TODO – Docs. Contribution is welcome.

Modules

class catalyst.contrib.nn.modules.common.Flatten[source]

Bases: torch.nn.modules.module.Module

Flattens the input. Does not affect the batch size.

@TODO: Docs (add Example). Contribution is welcome.
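In the meantime, a minimal sketch, assuming the module reshapes everything except the batch dimension, as the description states:

>>> flatten = Flatten()
>>> x = torch.randn(32, 3, 4, 4)
>>> flatten(x).shape
torch.Size([32, 48])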

__init__()[source]

@TODO: Docs. Contribution is welcome.

forward(x)[source]

Forward call.

class catalyst.contrib.nn.modules.common.Lambda(lambda_fn)[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(lambda_fn)[source]

@TODO: Docs. Contribution is welcome.

forward(x)[source]

Forward call.
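A minimal sketch, assuming the module simply applies lambda_fn to its input:

>>> layer = Lambda(lambda x: x.mean(dim=1))
>>> x = torch.randn(8, 4, 16)
>>> layer(x).shape
torch.Size([8, 16])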

class catalyst.contrib.nn.modules.common.Normalize(**normalize_kwargs)[source]

Bases: torch.nn.modules.module.Module

Performs \(L_p\) normalization of inputs over specified dimension.

@TODO: Docs (add Example). Contribution is welcome.
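In the meantime, a minimal sketch; p and dim are torch.nn.functional.normalize parameters, per the __init__ docs below:

>>> normalize = Normalize(p=2, dim=1)
>>> x = torch.randn(8, 128)
>>> out = normalize(x)  # rows now have unit L2 norm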

__init__(**normalize_kwargs)[source]
Parameters

**normalize_kwargs – see torch.nn.functional.normalize params

forward(x)[source]

Forward call.

class catalyst.contrib.nn.modules.lama.TemporalLastPooling[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor, mask: torch.Tensor = None) → torch.Tensor[source]

Forward call.

class catalyst.contrib.nn.modules.lama.TemporalAvgPooling[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor, mask: torch.Tensor = None) → torch.Tensor[source]

Forward call.

class catalyst.contrib.nn.modules.lama.TemporalMaxPooling[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor, mask: torch.Tensor = None) → torch.Tensor[source]

Forward call.

class catalyst.contrib.nn.modules.lama.TemporalDropLastWrapper(net)[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(net)[source]

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor, mask: torch.Tensor = None)[source]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.modules.lama.TemporalAttentionPooling(in_features, activation=None, kernel_size=1, **params)[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(in_features, activation=None, kernel_size=1, **params)[source]

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor, mask: torch.Tensor = None) → torch.Tensor[source]
Parameters

x (torch.Tensor) – tensor of size (batch_size, history_len, feature_size)

@TODO: Docs. Contribution is welcome.

name2activation = {'sigmoid': Sigmoid(), 'softmax': Softmax(dim=1), 'tanh': Tanh()}

class catalyst.contrib.nn.modules.lama.TemporalConcatPooling(in_features, history_len=1)[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(in_features, history_len=1)[source]

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor, mask: torch.Tensor = None) → torch.Tensor[source]
Parameters

x (torch.Tensor) – tensor of size (batch_size, history_len, feature_size)

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.modules.lama.LamaPooling(in_features, groups=None)[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(in_features, groups=None)[source]

@TODO: Docs. Contribution is welcome.

available_groups = ['last', 'avg', 'avg_droplast', 'max', 'max_droplast', 'sigmoid', 'sigmoid_droplast', 'softmax', 'softmax_droplast', 'tanh', 'tanh_droplast']

forward(x: torch.Tensor, mask: torch.Tensor = None) → torch.Tensor[source]
Parameters

x (torch.Tensor) – tensor of size (batch_size, history_len, feature_size)

@TODO: Docs. Contribution is welcome.
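A minimal usage sketch based on the documented input shape:

>>> pooling = LamaPooling(in_features=64)
>>> x = torch.randn(32, 10, 64)  # (batch_size, history_len, feature_size)
>>> out = pooling(x)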

class catalyst.contrib.nn.modules.pooling.GlobalAttnPool2d(in_features, activation_fn='Sigmoid')[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs. Contribution is welcome.

__init__(in_features, activation_fn='Sigmoid')[source]

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor) → torch.Tensor[source]

Forward call.

static out_features(in_features)[source]

Returns number of channels produced by the pooling.

Parameters

in_features – number of channels in the input sample

class catalyst.contrib.nn.modules.pooling.GlobalAvgAttnPool2d(in_features, activation_fn='Sigmoid')[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs (add Example). Contribution is welcome.

__init__(in_features, activation_fn='Sigmoid')[source]

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor) → torch.Tensor[source]

Forward call.

static out_features(in_features)[source]

Returns number of channels produced by the pooling.

Parameters

in_features – number of channels in the input sample

class catalyst.contrib.nn.modules.pooling.GlobalAvgPool2d[source]

Bases: torch.nn.modules.module.Module

Applies a 2D global average pooling operation over an input signal composed of several input planes.

@TODO: Docs (add Example). Contribution is welcome.
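In the meantime, a minimal sketch (the NCHW input shape is the standard assumption for 2D pooling):

>>> pool = GlobalAvgPool2d()
>>> x = torch.randn(16, 64, 7, 7)  # (batch, channels, height, width)
>>> out = pool(x)                  # spatial dims are averaged away
>>> n_channels = GlobalAvgPool2d.out_features(64)  # channels produced by the pooling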

__init__()[source]

Constructor method for the GlobalAvgPool2d class.

forward(x: torch.Tensor) → torch.Tensor[source]

Forward call.

static out_features(in_features)[source]

Returns number of channels produced by the pooling.

Parameters

in_features – number of channels in the input sample

class catalyst.contrib.nn.modules.pooling.GlobalConcatAttnPool2d(in_features, activation_fn='Sigmoid')[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs (add Example). Contribution is welcome.

__init__(in_features, activation_fn='Sigmoid')[source]

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor) → torch.Tensor[source]

Forward call.

static out_features(in_features)[source]

Returns number of channels produced by the pooling.

Parameters

in_features – number of channels in the input sample

class catalyst.contrib.nn.modules.pooling.GlobalConcatPool2d[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs (add Example). Contribution is welcome.

__init__()[source]

Constructor method for the GlobalConcatPool2d class.

forward(x: torch.Tensor) → torch.Tensor[source]

Forward call.

static out_features(in_features)[source]

Returns number of channels produced by the pooling.

Parameters

in_features – number of channels in the input sample

class catalyst.contrib.nn.modules.pooling.GlobalMaxAttnPool2d(in_features, activation_fn='Sigmoid')[source]

Bases: torch.nn.modules.module.Module

@TODO: Docs (add Example). Contribution is welcome.

__init__(in_features, activation_fn='Sigmoid')[source]

@TODO: Docs. Contribution is welcome.

forward(x: torch.Tensor) → torch.Tensor[source]

Forward call.

static out_features(in_features)[source]

Returns number of channels produced by the pooling.

Parameters

in_features – number of channels in the input sample

class catalyst.contrib.nn.modules.pooling.GlobalMaxPool2d[source]

Bases: torch.nn.modules.module.Module

Applies a 2D global max pooling operation over an input signal composed of several input planes.

@TODO: Docs (add Example). Contribution is welcome.
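In the meantime, a minimal sketch, analogous to GlobalAvgPool2d (NCHW input assumed):

>>> pool = GlobalMaxPool2d()
>>> x = torch.randn(16, 64, 7, 7)
>>> out = pool(x)  # per-channel spatial maxima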

__init__()[source]

Constructor method for the GlobalMaxPool2d class.

forward(x: torch.Tensor) → torch.Tensor[source]

Forward call.

static out_features(in_features)[source]

Returns number of channels produced by the pooling.

Parameters

in_features – number of channels in the input sample

Optimizers

class catalyst.contrib.nn.optimizers.lamb.Lamb(params, lr: Optional[float] = 0.001, betas: Optional[Tuple[float, float]] = (0.9, 0.999), eps: Optional[float] = 1e-06, weight_decay: Optional[float] = 0.0, adam: Optional[bool] = False)[source]

Bases: torch.optim.optimizer.Optimizer

Implements Lamb algorithm.

It has been proposed in Training BERT in 76 minutes.
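A usage sketch in the style of the QHAdamW example below (model, loss_fn, input and target are assumed to exist):

>>> optimizer = Lamb(
...     model.parameters(),
...     lr=1e-3, betas=(0.9, 0.999), weight_decay=1e-2)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()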

__init__(params, lr: Optional[float] = 0.001, betas: Optional[Tuple[float, float]] = (0.9, 0.999), eps: Optional[float] = 1e-06, weight_decay: Optional[float] = 0.0, adam: Optional[bool] = False)[source]
Parameters
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – learning rate (default: 1e-3)

  • betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))

  • eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-6)

  • weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)

  • adam (bool, optional) – always use trust ratio = 1, which turns this into Adam. Useful for comparison purposes.

step(closure: Optional[Callable] = None)[source]

Makes optimizer step.

Parameters

closure (callable, optional) – A closure that reevaluates the model and returns the loss.

catalyst.contrib.nn.optimizers.lamb.log_lamb_rs(optimizer: torch.optim.optimizer.Optimizer, event_writer, token_count: int)[source]

Log a histogram of trust ratio scalars across layers.

class catalyst.contrib.nn.optimizers.lookahead.Lookahead(optimizer: torch.optim.optimizer.Optimizer, k: int = 5, alpha: float = 0.5)[source]

Bases: torch.optim.optimizer.Optimizer

Implements Lookahead algorithm.

It has been proposed in Lookahead Optimizer: k steps forward, 1 step back.

Main origins of inspiration:

https://github.com/alphadl/lookahead.pytorch (MIT License)
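A usage sketch; Lookahead wraps a base optimizer, per the signature above (model and loss_fn are assumed to exist):

>>> base_optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
>>> optimizer = Lookahead(base_optimizer, k=5, alpha=0.5)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()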

__init__(optimizer: torch.optim.optimizer.Optimizer, k: int = 5, alpha: float = 0.5)[source]

@TODO: Docs. Contribution is welcome.

add_param_group(param_group)[source]

@TODO: Docs. Contribution is welcome.

classmethod get_from_params(params: Dict, base_optimizer_params: Dict = None, **kwargs) → catalyst.contrib.nn.optimizers.lookahead.Lookahead[source]

@TODO: Docs. Contribution is welcome.

load_state_dict(state_dict)[source]

@TODO: Docs. Contribution is welcome.

state_dict()[source]

@TODO: Docs. Contribution is welcome.

step(closure: Optional[Callable] = None)[source]

Makes optimizer step.

Parameters

closure (callable, optional) – A closure that reevaluates the model and returns the loss.

update(group)[source]

@TODO: Docs. Contribution is welcome.

update_lookahead()[source]

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.optimizers.qhadamw.QHAdamW(params, lr=0.001, betas=(0.995, 0.999), nus=(0.7, 1.0), weight_decay=0.0, eps=1e-08)[source]

Bases: torch.optim.optimizer.Optimizer

Implements QHAdam algorithm.

Combines the QHAdam algorithm, proposed in Quasi-hyperbolic momentum and Adam for deep learning, with the weight decay decoupling from the Decoupled Weight Decay Regularization paper.

Example

>>> optimizer = QHAdamW(
...     model.parameters(),
...     lr=3e-4, nus=(0.8, 1.0), betas=(0.99, 0.999))
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()
Main origins of inspiration:

https://github.com/iprally/qhadamw-pytorch/blob/master/qhadamw.py (MIT License)

__init__(params, lr=0.001, betas=(0.995, 0.999), nus=(0.7, 1.0), weight_decay=0.0, eps=1e-08)[source]
Parameters
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – learning rate (\(\alpha\) from the paper) (default: 1e-3)

  • betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.995, 0.999))

  • nus (Tuple[float, float], optional) – immediate discount factors used to estimate the gradient and its square (default: (0.7, 1.0))

  • eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)

  • weight_decay (float, optional) – weight decay (L2 regularization coefficient, times two) (default: 0.0)

step(closure: Optional[Callable] = None)[source]

Makes optimizer step.

Parameters

closure (callable, optional) – A closure that reevaluates the model and returns the loss.

class catalyst.contrib.nn.optimizers.radam.RAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)[source]

Bases: torch.optim.optimizer.Optimizer

Implements RAdam algorithm.

It has been proposed in On the Variance of the Adaptive Learning Rate and Beyond.

@TODO: Docs (add Example). Contribution is welcome
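In the meantime, a usage sketch in the style of the QHAdamW example above:

>>> optimizer = RAdam(model.parameters(), lr=1e-3)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()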

Main origins of inspiration:

https://github.com/LiyuanLucasLiu/RAdam (Apache-2.0 License)

__init__(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)[source]

@TODO: Docs. Contribution is welcome.

step(closure: Optional[Callable] = None)[source]

Makes optimizer step.

Parameters

closure (callable, optional) – A closure that reevaluates the model and returns the loss.

class catalyst.contrib.nn.optimizers.ralamb.Ralamb(params: Iterable, lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0)[source]

Bases: torch.optim.optimizer.Optimizer

RAdam optimizer with LARS/LAMB tricks.

Main origins of inspiration:

https://github.com/mgrankin/over9000/blob/master/ralamb.py (Apache-2.0 License)

__init__(params: Iterable, lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0)[source]
Parameters
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – learning rate (default: 1e-3)

  • betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))

  • eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)

  • weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)

step(closure: Optional[Callable] = None)[source]

Makes optimizer step.

Parameters

closure (callable, optional) – A closure that reevaluates the model and returns the loss.

Schedulers

class catalyst.contrib.nn.schedulers.base.BaseScheduler(optimizer, last_epoch=-1)[source]

Bases: torch.optim.lr_scheduler._LRScheduler, abc.ABC

Base class for all schedulers with momentum update.

get_momentum() → List[float][source]

Function that returns the new momentum for the optimizer.

Returns

calculated momentum for every param group

Return type

List[float]

step(epoch: Optional[int] = None) → None[source]

Make one scheduler step.

Parameters

epoch (int, optional) – current epoch num

class catalyst.contrib.nn.schedulers.base.BatchScheduler(optimizer, last_epoch=-1)[source]

Bases: catalyst.contrib.nn.schedulers.base.BaseScheduler, abc.ABC

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.schedulers.onecycle.OneCycleLRWithWarmup(optimizer: torch.optim.optimizer.Optimizer, num_steps: int, lr_range=(1.0, 0.005), init_lr: float = None, warmup_steps: int = 0, warmup_fraction: float = None, decay_steps: int = 0, decay_fraction: float = None, momentum_range=(0.8, 0.99, 0.999), init_momentum: float = None)[source]

Bases: catalyst.contrib.nn.schedulers.base.BatchScheduler

OneCycle scheduler with warm-up & lr decay stages.

The first stage, called warmup, increases lr from init_lr to max_lr and decreases momentum from init_momentum to min_momentum; it takes warmup_steps steps.

The second is the annealing stage: lr decreases from max_lr to min_lr while momentum increases from min_momentum to max_momentum.

The third, optional, stage is lr decay.
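A construction sketch based on the parameters documented below (the concrete numbers are illustrative):

>>> scheduler = OneCycleLRWithWarmup(
...     optimizer,
...     num_steps=1000,
...     lr_range=(0.005, 0.0005),     # (max_lr, min_lr)
...     warmup_steps=100,
...     momentum_range=(0.85, 0.95))  # (min_momentum, max_momentum)
>>> scheduler.step()  # as a BatchScheduler, stepped once per batch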

__init__(optimizer: torch.optim.optimizer.Optimizer, num_steps: int, lr_range=(1.0, 0.005), init_lr: float = None, warmup_steps: int = 0, warmup_fraction: float = None, decay_steps: int = 0, decay_fraction: float = None, momentum_range=(0.8, 0.99, 0.999), init_momentum: float = None)[source]
Parameters
  • optimizer – PyTorch optimizer

  • num_steps (int) – total number of steps

  • lr_range – tuple with two or three elements (max_lr, min_lr, [final_lr])

  • init_lr (float, optional) – initial lr

  • warmup_steps (int) – count of steps for warm-up stage

  • warmup_fraction (float, optional) – fraction in [0; 1) to calculate number of warmup steps. Cannot be set together with warmup_steps

  • decay_steps (int) – count of steps for lr decay stage

  • decay_fraction (float, optional) – fraction in [0; 1) to calculate number of decay steps. Cannot be set together with decay_steps

  • momentum_range – tuple with two or three elements (min_momentum, max_momentum, [final_momentum])

  • init_momentum (float, optional) – initial momentum

get_lr() → List[float][source]

Function that returns the new lr for the optimizer.

Returns

calculated lr for every param group

Return type

List[float]

get_momentum() → List[float][source]

Function that returns the new momentum for the optimizer.

Returns

calculated momentum for every param group

Return type

List[float]

recalculate(loader_len: int, current_step: int) → None[source]

Recalculates total num_steps for batch mode.

Parameters
  • loader_len (int) – total count of batches in an epoch

  • current_step (int) – current step

reset()[source]

@TODO: Docs. Contribution is welcome.

Models

Segmentation

class catalyst.contrib.models.cv.segmentation.unet.ResnetUnet(num_classes: int = 1, arch: str = 'resnet18', pretrained: bool = True, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)[source]

Bases: catalyst.contrib.models.cv.segmentation.core.ResnetUnetSpec

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.unet.Unet(num_classes: int = 1, in_channels: int = 3, num_channels: int = 32, num_blocks: int = 4, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)[source]

Bases: catalyst.contrib.models.cv.segmentation.core.UnetSpec

@TODO: Docs. Contribution is welcome.
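In the meantime, a minimal sketch (the 256x256 input size is an illustrative assumption):

>>> model = Unet(num_classes=1, in_channels=3, num_channels=32, num_blocks=4)
>>> x = torch.randn(1, 3, 256, 256)
>>> out = model(x)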

class catalyst.contrib.models.cv.segmentation.linknet.Linknet(num_classes: int = 1, in_channels: int = 3, num_channels: int = 32, num_blocks: int = 4, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)[source]

Bases: catalyst.contrib.models.cv.segmentation.core.UnetSpec

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.linknet.ResnetLinknet(num_classes: int = 1, arch: str = 'resnet18', pretrained: bool = True, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)[source]

Bases: catalyst.contrib.models.cv.segmentation.core.ResnetUnetSpec

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.fpn.FPNUnet(num_classes: int = 1, in_channels: int = 3, num_channels: int = 32, num_blocks: int = 4, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)[source]

Bases: catalyst.contrib.models.cv.segmentation.core.UnetSpec

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.fpn.ResnetFPNUnet(num_classes: int = 1, arch: str = 'resnet18', pretrained: bool = True, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)[source]

Bases: catalyst.contrib.models.cv.segmentation.core.ResnetUnetSpec

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.psp.PSPnet(num_classes: int = 1, in_channels: int = 3, num_channels: int = 32, num_blocks: int = 4, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)[source]

Bases: catalyst.contrib.models.cv.segmentation.core.UnetSpec

@TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.psp.ResnetPSPnet(num_classes: int = 1, arch: str = 'resnet18', pretrained: bool = True, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)[source]

Bases: catalyst.contrib.models.cv.segmentation.core.ResnetUnetSpec

@TODO: Docs. Contribution is welcome.

Registry

catalyst subpackage registries

catalyst.contrib.registry.Criterion(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

Adds factory to registry with its __name__ attribute or provided name. Signature is flexible.

Parameters
  • factory – Factory instance

  • factories – More instances

  • name – Provided name for the first instance. Use only when passing a single instance.

  • named_factories – Factories and their names as kwargs

Returns

First factory passed

Return type

(Factory)

catalyst.contrib.registry.Optimizer(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

Adds factory to registry with its __name__ attribute or provided name. Signature is flexible.

Parameters
  • factory – Factory instance

  • factories – More instances

  • name – Provided name for the first instance. Use only when passing a single instance.

  • named_factories – Factories and their names as kwargs

Returns

First factory passed

Return type

(Factory)

catalyst.contrib.registry.Scheduler(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

Adds factory to registry with its __name__ attribute or provided name. Signature is flexible.

Parameters
  • factory – Factory instance

  • factories – More instances

  • name – Provided name for the first instance. Use only when passing a single instance.

  • named_factories – Factories and their names as kwargs

Returns

First factory passed

Return type

(Factory)

catalyst.contrib.registry.Module(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

Adds factory to registry with its __name__ attribute or provided name. Signature is flexible.

Parameters
  • factory – Factory instance

  • factories – More instances

  • name – Provided name for the first instance. Use only when passing a single instance.

  • named_factories – Factories and their names as kwargs

Returns

First factory passed

Return type

(Factory)

catalyst.contrib.registry.Model(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

Adds factory to registry with its __name__ attribute or provided name. Signature is flexible.

Parameters
  • factory – Factory instance

  • factories – More instances

  • name – Provided name for the first instance. Use only when passing a single instance.

  • named_factories – Factories and their names as kwargs

Returns

First factory passed

Return type

(Factory)

catalyst.contrib.registry.Sampler(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

Adds factory to registry with its __name__ attribute or provided name. Signature is flexible.

Parameters
  • factory – Factory instance

  • factories – More instances

  • name – Provided name for the first instance. Use only when passing a single instance.

  • named_factories – Factories and their names as kwargs

Returns

First factory passed

Return type

(Factory)

catalyst.contrib.registry.Transform(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

Adds factory to registry with its __name__ attribute or provided name. Signature is flexible.

Parameters
  • factory – Factory instance

  • factories – More instances

  • name – Provided name for the first instance. Use only when passing a single instance.

  • named_factories – Factories and their names as kwargs

Returns

First factory passed

Return type

(Factory)

Utilities

Argparse

catalyst.contrib.utils.argparse.boolean_flag(parser: argparse.ArgumentParser, name: str, default: Optional[bool] = False, help: str = None, shorthand: str = None) → None[source]

Add a boolean flag to a parser inplace.

Examples

>>> parser = argparse.ArgumentParser()
>>> boolean_flag(
>>>     parser, "flag", default=False, help="some flag", shorthand="f"
>>> )
Parameters
  • parser (argparse.ArgumentParser) – parser to add the flag to

  • name (str) – argument name; --<name> will enable the flag, while --no-<name> will disable it

  • default (bool, optional) – default value of the flag

  • help (str) – help string for the flag

  • shorthand (str) – shorthand string for the argument

Compression

catalyst.contrib.utils.compression.pack(data)

Serialize the data into bytes using pickle.

Parameters

data – a value

Returns

Returns a bytes object containing the data serialized with pickle.

catalyst.contrib.utils.compression.pack_if_needed(data)

Serialize the data into bytes using pickle.

Parameters

data – a value

Returns

Returns a bytes object containing the data serialized with pickle.

catalyst.contrib.utils.compression.unpack(data)

Deserialize bytes into an object using pickle.

Parameters

data – a bytes object containing data serialized with pickle.

Returns

Returns a value deserialized from the bytes-like object.

catalyst.contrib.utils.compression.unpack_if_needed(data)

Deserialize bytes into an object using pickle.

Parameters

data – a bytes object containing data serialized with pickle.

Returns

Returns a value deserialized from the bytes-like object.
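A round-trip sketch (pickle serialization is symmetric, so unpack inverts pack):

>>> data = {"key": [1, 2, 3]}
>>> serialized = pack(data)
>>> unpack(serialized) == data
True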

Confusion Matrix

catalyst.contrib.utils.confusion_matrix.calculate_tp_fp_fn(confusion_matrix: numpy.ndarray) → numpy.ndarray[source]

@TODO: Docs. Contribution is welcome.

catalyst.contrib.utils.confusion_matrix.calculate_confusion_matrix_from_arrays(ground_truth: numpy.ndarray, prediction: numpy.ndarray, num_classes: int) → numpy.ndarray[source]

Calculate the confusion matrix for a given set of classes. Ground-truth values outside the [0, num_classes) range are excluded.

Parameters
  • ground_truth (np.ndarray) –

  • prediction (np.ndarray) –

  • num_classes (int) –

@TODO: Docs. Contribution is welcome
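In the meantime, a minimal usage sketch (the row/column orientation of the result is left to the implementation):

>>> ground_truth = np.array([0, 1, 1, 2])
>>> prediction = np.array([0, 1, 2, 2])
>>> cm = calculate_confusion_matrix_from_arrays(
...     ground_truth, prediction, num_classes=3)
>>> cm.shape
(3, 3)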

catalyst.contrib.utils.confusion_matrix.calculate_confusion_matrix_from_tensors(y_pred_logits: torch.Tensor, y_true: torch.Tensor) → numpy.ndarray[source]

@TODO: Docs. Contribution is welcome.

Dataset

catalyst.contrib.utils.dataset.create_dataset(dirs: str, extension: str = None, process_fn: Callable[[str], object] = None, recursive: bool = False) → Dict[str, object][source]

Create a dataset (dict like {key: [values]}) from a VCTK-like directory layout:

dataset/
    cat/
        *.ext
    dog/
        *.ext
Parameters
  • dirs (str) – path to dirs, for example /home/user/data/**

  • extension (str) – data extension you are looking for

  • process_fn (Callable[[str], object]) – function(path_to_file) -> object, a process function applied to each found file

  • recursive (bool) – enables recursive globbing

Returns

dataset

Return type

dict

catalyst.contrib.utils.dataset.create_dataframe(dataset: Dict[str, object], **dataframe_args) → pandas.core.frame.DataFrame[source]

Create pd.DataFrame from dict like {key: [values]}.

Parameters
  • dataset – dict like {key: [values]}

  • **dataframe_args

    index : Index or array-like

    Index to use for the resulting frame. Will default to np.arange(n) if no indexing information is part of the input data and no index is provided

    columns : Index or array-like

    Column labels to use for the resulting frame. Will default to np.arange(n) if no column labels are provided

    dtype : dtype, default None

    Data type to force, otherwise infer

Returns

dataframe built from the given dataset

Return type

pd.DataFrame
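A combined sketch for the two functions above (the path and column names are illustrative):

>>> dataset = create_dataset(dirs="/path/to/dataset/*", extension="jpg")
>>> dataframe = create_dataframe(dataset, columns=["class", "filepath"])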

catalyst.contrib.utils.dataset.split_dataset_train_test(dataset: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[Dict[str, object], Dict[str, object]][source]

Split dataset in train and test parts.

Parameters
  • dataset – dict like dataset

  • **train_test_split_args

    test_size : float, int, or None (default is None)

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.

    train_size : float, int, or None (default is None)

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.

    random_state : int or RandomState

    Pseudo-random number generator state used for random sampling.

    stratify : array-like or None (default is None)

    If not None, data is split in a stratified fashion, using this as the class labels.

Returns

train and test dicts

Misc

catalyst.contrib.utils.misc.args_are_not_none(*args: Optional[Any]) → bool[source]

Check that all arguments are not None.

Parameters

*args (Any) – values

Returns

True if all values are not None, False otherwise

Return type

bool

catalyst.contrib.utils.misc.make_tuple(tuple_like)[source]

Creates a tuple if the given tuple_like value isn't a list or tuple.

Returns

tuple or list

catalyst.contrib.utils.misc.pairwise(iterable: Iterable[Any]) → Iterable[Any][source]

Iterate sequences by pairs.

Examples

>>> for i in pairwise([1, 2, 5, -3]):
>>>     print(i)
(1, 2)
(2, 5)
(5, -3)
Parameters

iterable – Any iterable sequence

Returns

pairwise iterator

Pandas

catalyst.contrib.utils.pandas.dataframe_to_list(dataframe: pandas.core.frame.DataFrame) → List[dict][source]

Converts dataframe to a list of rows (without indexes).

Parameters

dataframe (DataFrame) – input dataframe

Returns

list of rows

Return type

(List[dict])

catalyst.contrib.utils.pandas.folds_to_list(folds: Union[list, str, pandas.core.series.Series]) → List[int][source]

This function formats a string or a list of numbers into a list of unique ints.

Examples

>>> folds_to_list("1,2,1,3,4,2,4,6")
[1, 2, 3, 4, 6]
>>> folds_to_list([1, 2, 3.0, 5])
[1, 2, 3, 5]
Parameters

folds (Union[list, str, pd.Series]) – either a list of numbers, a string with comma-separated numbers, or a pandas Series

Returns

list of unique ints

Return type

List[int]

Raises

ValueError – if a value in the string or array cannot be cast to int

catalyst.contrib.utils.pandas.split_dataframe(dataframe: pandas.core.frame.DataFrame, train_folds: List[int], valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, tag2class: Optional[Dict[str, int]] = None, tag_column: str = None, class_column: str = None, seed: int = 42, n_folds: int = 5) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]

Split a Pandas DataFrame into folds.

Parameters
  • dataframe (pd.DataFrame) – input dataframe

  • train_folds (List[int]) – train folds

  • valid_folds (List[int], optional) – valid folds. If None, takes all folds not included in train_folds

  • infer_folds (List[int], optional) – infer folds. If None, takes all folds not included in train_folds and valid_folds

  • tag2class (Dict[str, int], optional) – mapping from label names into int

  • tag_column (str, optional) – column with label names

  • class_column (str, optional) – column to use for split

  • seed (int) – seed for split

  • n_folds (int) – number of folds

Returns

tuple with 4 dataframes

whole dataframe, train part, valid part and infer part

Return type

(tuple)
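A minimal usage sketch (the fold indices are illustrative):

>>> df_all, df_train, df_valid, df_infer = split_dataframe(
...     dataframe, train_folds=[0, 1, 2], valid_folds=[3], infer_folds=[4])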

catalyst.contrib.utils.pandas.split_dataframe_on_column_folds(dataframe: pandas.core.frame.DataFrame, column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]

Splits DataFrame into N folds.

Parameters
  • dataframe – a dataset

  • column – which column to use

  • random_state – seed for random shuffle

  • n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.contrib.utils.pandas.split_dataframe_on_folds(dataframe: pandas.core.frame.DataFrame, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]

Splits DataFrame into N folds.

Parameters
  • dataframe – a dataset

  • random_state – seed for random shuffle

  • n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.contrib.utils.pandas.split_dataframe_on_stratified_folds(dataframe: pandas.core.frame.DataFrame, class_column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]

Splits DataFrame into N stratified folds.

Also see catalyst.data.sampler.BalanceClassSampler

Parameters
  • dataframe – a dataset

  • class_column – which column to use for split

  • random_state – seed for random shuffle

  • n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.contrib.utils.pandas.split_dataframe_train_test(dataframe: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]

Split dataframe in train and test part.

Parameters
  • dataframe – pd.DataFrame to split

  • **train_test_split_args

    test_size : float, int, or None (default is None)

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.

    train_size : float, int, or None (default is None)

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.

    random_state : int or RandomState

    Pseudo-random number generator state used for random sampling.

    stratify : array-like or None (default is None)

    If not None, data is split in a stratified fashion, using this as the class labels.

Returns

train and test DataFrames

Note

It exists because the sklearn split is overcomplicated.

catalyst.contrib.utils.pandas.separate_tags(dataframe: pandas.core.frame.DataFrame, tag_column: str = 'tag', tag_delim: str = ', ') → pandas.core.frame.DataFrame[source]

Separates values in the tag_column column.

Parameters
  • dataframe – a dataset

  • tag_column – column name to separate values

  • tag_delim – delimiter to separate values

Returns

new dataframe

Return type

pd.DataFrame

catalyst.contrib.utils.pandas.read_multiple_dataframes(in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]

This function reads train/valid/infer dataframes from the given paths.

Parameters
  • in_csv_train (str) – paths to train csv separated by commas

  • in_csv_valid (str) – paths to valid csv separated by commas

  • in_csv_infer (str) – paths to infer csv separated by commas

  • tag2class (Dict[str, int], optional) – mapping from label names into int

  • tag_column (str, optional) – column with label names

  • class_column (str, optional) – column to use for split

Returns

tuple with 4 dataframes

whole dataframe, train part, valid part and infer part

Return type

(tuple)

catalyst.contrib.utils.pandas.map_dataframe(dataframe: pandas.core.frame.DataFrame, tag_column: str, class_column: str, tag2class: Dict[str, int], verbose: bool = False) → pandas.core.frame.DataFrame[source]

This function maps tags from tag_column to int class labels in class_column using the tag2class dictionary.

Parameters
  • dataframe (pd.DataFrame) – input dataframe

  • tag_column (str) – column with tags

  • class_column (str) –

  • tag2class (Dict[str, int]) – mapping from tags to class labels

  • verbose – flag if true, uses tqdm

Returns

updated dataframe with class_column

Return type

pd.DataFrame

catalyst.contrib.utils.pandas.get_dataset_labeling(dataframe: pandas.core.frame.DataFrame, tag_column: str) → Dict[str, int][source]

Prepares a mapping using unique values from tag_column.

{
    "class_name_0": 0,
    "class_name_1": 1,
    ...
    "class_name_N": N
}
Parameters
  • dataframe – a dataset

  • tag_column – which column to use

Returns

mapping from tag to labels

Return type

Dict[str, int]

catalyst.contrib.utils.pandas.merge_multiple_fold_csv(fold_name: str, paths: Optional[str]) → pandas.core.frame.DataFrame[source]

Reads CSV files into one DataFrame with a fold column.

Parameters
  • fold_name (str) – current fold name

  • paths (str) – paths to csv separated by commas

Returns

merged dataframes with column fold == fold_name

Return type

pd.DataFrame

catalyst.contrib.utils.pandas.read_csv_data(in_csv: str = None, train_folds: Optional[List[int]] = None, valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, seed: int = 42, n_folds: int = 5, in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, List[dict], List[dict], List[dict]][source]

Reads a dataframe from the given path in_csv and splits it into train/valid/infer folds, or reads independent folds from the separate paths in_csv_train, in_csv_valid, in_csv_infer.

Note

This function can be used with different combinations of params.
First block is used to get dataset from one csv:

in_csv, train_folds, valid_folds, infer_folds, seed, n_folds

Second includes paths to different csv for train/valid and infer parts:

in_csv_train, in_csv_valid, in_csv_infer

The other params (tag2class, tag_column, class_column) are optional for either block.

Parameters
  • in_csv (str) – paths to whole dataset

  • train_folds (List[int]) – train folds

  • valid_folds (List[int], optional) – valid folds. If None, takes all folds not included in train_folds

  • infer_folds (List[int], optional) – infer folds. If None, takes all folds not included in train_folds and valid_folds

  • seed (int) – seed for split

  • n_folds (int) – number of folds

  • in_csv_train (str) – paths to train csv separated by commas

  • in_csv_valid (str) – paths to valid csv separated by commas

  • in_csv_infer (str) – paths to infer csv separated by commas

  • tag2class (Dict[str, int]) – mapping from label names into ints

  • tag_column (str) – column with label names

  • class_column (str) – column to use for split

Returns

tuple with 4 elements (whole dataframe, list with train data, list with valid data and list with infer data)

Return type

(Tuple[pd.DataFrame, List[dict], List[dict], List[dict]])

catalyst.contrib.utils.pandas.balance_classes(dataframe: pandas.core.frame.DataFrame, class_column: str = 'label', random_state: int = 42, how: str = 'downsampling') → pandas.core.frame.DataFrame[source]

Balance classes in dataframe by class_column.

See also catalyst.data.sampler.BalanceClassSampler.

Parameters
  • dataframe – a dataset

  • class_column – which column to use for split

  • random_state – seed for random shuffle

  • how – sampling strategy; must be one of ["downsampling", "upsampling"]

Returns

new dataframe with balanced class_column

Return type

pd.DataFrame

Parallel

catalyst.contrib.utils.parallel.parallel_imap(func, args, pool: Union[multiprocessing.pool.Pool, catalyst.contrib.utils.parallel.DumbPool]) → List[T][source]

@TODO: Docs. Contribution is welcome.

catalyst.contrib.utils.parallel.tqdm_parallel_imap(func, args, pool: Union[multiprocessing.pool.Pool, catalyst.contrib.utils.parallel.DumbPool], total: int = None, pbar=<class 'tqdm.std.tqdm'>) → List[T][source]

@TODO: Docs. Contribution is welcome.

catalyst.contrib.utils.parallel.get_pool(workers: int) → Union[multiprocessing.pool.Pool, catalyst.contrib.utils.parallel.DumbPool][source]

@TODO: Docs. Contribution is welcome.

Plotly

catalyst.contrib.utils.plotly.plot_tensorboard_log(logdir: Union[str, pathlib.Path], step: Optional[str] = 'batch', metrics: Optional[List[str]] = None, height: Optional[int] = None, width: Optional[int] = None) → None[source]

@TODO: Docs. Contribution is welcome.

Serialization

catalyst.contrib.utils.serialization.serialize(data)

Serialize the data into bytes using pickle.

Parameters

data – a value

Returns

Returns a bytes object containing the data serialized with pickle.

catalyst.contrib.utils.serialization.deserialize(data)

Deserialize bytes into an object using pickle.

Parameters

data – a bytes object containing data serialized with pickle.

Returns

Returns a value deserialized from the bytes-like object.

Text

catalyst.contrib.utils.text.tokenize_text(text: str, tokenizer, max_length: int, strip: bool = True, lowercase: bool = True, remove_punctuation: bool = True) → Dict[str, numpy.array][source]

Tokenizes the given text.

Parameters
  • text (str) – text to tokenize

  • tokenizer – Tokenizer instance from HuggingFace

  • max_length (int) – maximum length of tokens

  • strip (bool) – if true strips text before tokenizing

  • lowercase (bool) – if true makes text lowercase before tokenizing

  • remove_punctuation (bool) – if true removes string.punctuation from text before tokenizing
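A minimal sketch; the BertTokenizer import is an assumption about which HuggingFace tokenizer you use:

>>> from transformers import BertTokenizer
>>> tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
>>> features = tokenize_text(
...     "Hello, Catalyst!", tokenizer, max_length=32)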

catalyst.contrib.utils.text.process_bert_output(bert_output, hidden_size: int, output_hidden_states: bool = False, pooling_groups: List[str] = None, mask: torch.Tensor = None, level: Union[int, str] = None)[source]

Processes the BERT output.

Visualization

catalyst.contrib.utils.visualization.plot_confusion_matrix(cm, class_names=None, normalize=False, title='confusion matrix', fname=None, show=True, figsize=12, fontsize=32, colormap='Blues')[source]

Render the confusion matrix and return the matplotlib figure with it. Normalization can be applied by setting normalize=True.

catalyst.contrib.utils.visualization.render_figure_to_tensor(figure)[source]

@TODO: Docs. Contribution is welcome.

catalyst.contrib.utils.visualization.plot_metrics(logdir: Union[str, pathlib.Path], step: Optional[str] = 'epoch', metrics: Optional[List[str]] = None, height: Optional[int] = None, width: Optional[int] = None) → None[source]

Plots your learning results.

Parameters
  • logdir – the logdir that was specified during training.

  • step – ‘batch’ or ‘epoch’ - what logs to show: for batches or for epochs

  • metrics – list of metrics to plot. The loss should be specified as ‘loss’, learning rate = ‘_base/lr’ and other metrics should be specified as names in metrics dict that was specified during training

  • height – the height of the whole resulting plot

  • width – the width of the whole resulting plot

Tools

Tensorboard

Tensorboard readers:

exception catalyst.contrib.utils.tools.tensorboard.EventReadingException[source]

Bases: Exception

An exception that corresponds to an event file reading error.

class catalyst.contrib.utils.tools.tensorboard.EventsFileReader(events_file: BinaryIO)[source]

Bases: collections.abc.Iterable

An iterator over a Tensorboard events file.

__init__(events_file: BinaryIO)[source]

Initialize an iterator over an events file.

Parameters

events_file – An opened file-like object.

class catalyst.contrib.utils.tools.tensorboard.SummaryItem(tag, step, wall_time, value, type)

Bases: tuple

property step

Alias for field number 1

property tag

Alias for field number 0

property type

Alias for field number 4

property value

Alias for field number 3

property wall_time

Alias for field number 2

class catalyst.contrib.utils.tools.tensorboard.SummaryReader(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))[source]

Bases: collections.abc.Iterable

Iterates over events in all the files in the current logdir.

Note

Only scalars and images are supported at the moment.

__init__(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))[source]

Initialize a new summary reader.

Parameters
  • logdir – A directory with Tensorboard summary data

  • tag_filter – A list of tags to leave (None for all)

  • types – A list of types to get. Only "scalar" and "image" types are allowed at the moment.