Contrib¶
NN¶
Criterion¶
class catalyst.contrib.nn.criterion.ce.MaskCrossEntropyLoss(*args, target_name: str = 'targets', mask_name: str = 'mask', **kwargs)

    Bases: torch.nn.modules.loss.CrossEntropyLoss

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.criterion.ce.SymmetricCrossEntropyLoss(alpha: float = 1.0, beta: float = 1.0)

    Bases: torch.nn.modules.module.Module

    The Symmetric Cross Entropy loss.

    It has been proposed in Symmetric Cross Entropy for Robust Learning with Noisy Labels.

    __init__(alpha: float = 1.0, beta: float = 1.0)

        Parameters:
            - alpha (float) – corresponds to the overfitting issue of CE
            - beta (float) – corresponds to flexible exploration on the robustness of RCE

    forward(input: torch.Tensor, target: torch.Tensor) → torch.Tensor

        Calculates the loss between the input and target tensors.

        Parameters:
            - input (torch.Tensor) – input tensor of size (batch_size, num_classes)
            - target (torch.Tensor) – target tensor of size (batch_size), where each value is a class index
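    Example (a minimal sketch; the tensor shapes follow the forward signature documented above):

    >>> import torch
    >>> from catalyst.contrib.nn.criterion.ce import SymmetricCrossEntropyLoss
    >>> criterion = SymmetricCrossEntropyLoss(alpha=1.0, beta=1.0)
    >>> logits = torch.randn(8, 10)           # (batch_size, num_classes)
    >>> targets = torch.randint(0, 10, (8,))  # class indices
    >>> loss = criterion(logits, targets)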
class catalyst.contrib.nn.criterion.ce.NaiveCrossEntropyLoss(size_average=True)

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.criterion.contrastive.ContrastiveEmbeddingLoss(margin=1.0, reduction='mean')

    Bases: torch.nn.modules.module.Module

    The Contrastive embedding loss.

    It has been proposed in Dimensionality Reduction by Learning an Invariant Mapping.

    __init__(margin=1.0, reduction='mean')

        Parameters:
            - margin – margin parameter
            - reduction – criterion reduction type

    forward(embeddings_left: torch.Tensor, embeddings_right: torch.Tensor, distance_true) → torch.Tensor

        Forward propagation method for the contrastive loss.

        Parameters:
            - embeddings_left (torch.Tensor) – left objects embeddings
            - embeddings_right (torch.Tensor) – right objects embeddings
            - distance_true – true distances

        Returns:
            loss

        Return type:
            torch.Tensor
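    Example (a hedged sketch; treating 1 as a dissimilar pair and 0 as a similar one follows the usual contrastive formulation and is an assumption here):

    >>> import torch
    >>> from catalyst.contrib.nn.criterion.contrastive import ContrastiveEmbeddingLoss
    >>> criterion = ContrastiveEmbeddingLoss(margin=1.0, reduction="mean")
    >>> left = torch.randn(8, 128)   # left objects embeddings
    >>> right = torch.randn(8, 128)  # right objects embeddings
    >>> distance_true = torch.randint(0, 2, (8,)).float()  # assumed pair labels
    >>> loss = criterion(left, right, distance_true)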
class catalyst.contrib.nn.criterion.contrastive.ContrastiveDistanceLoss(margin=1.0, reduction='mean')

    Bases: torch.nn.modules.module.Module

    The Contrastive distance loss.

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.criterion.contrastive.ContrastivePairwiseEmbeddingLoss(margin=1.0, reduction='mean')

    Bases: torch.nn.modules.module.Module

    ContrastivePairwiseEmbeddingLoss – proof-of-concept criterion. Still a work in progress.

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.criterion.dice.BCEDiceLoss(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid', bce_weight: float = 0.5, dice_weight: float = 0.5)

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.criterion.dice.DiceLoss(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.
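    Example (a hedged sketch; the forward(outputs, targets) call and the tensor layout are assumptions, mirroring the other segmentation criteria in this module):

    >>> import torch
    >>> from catalyst.contrib.nn.criterion.dice import DiceLoss
    >>> criterion = DiceLoss(eps=1e-07, activation="Sigmoid")
    >>> outputs = torch.randn(4, 1, 32, 32)  # raw logits
    >>> targets = torch.randint(0, 2, (4, 1, 32, 32)).float()
    >>> loss = criterion(outputs, targets)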
class catalyst.contrib.nn.criterion.focal.FocalLossBinary(ignore: int = None, reduced: bool = False, gamma: float = 2.0, alpha: float = 0.25, threshold: float = 0.5, reduction: str = 'mean')

    Bases: torch.nn.modules.loss._Loss

    Computes focal loss for a binary classification problem.

    It has been proposed in the Focal Loss for Dense Object Detection paper.

    @TODO: Docs (add Example). Contribution is welcome.
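    Example (a minimal sketch; the input/target layout is an assumption, following the usual binary-classification setup):

    >>> import torch
    >>> from catalyst.contrib.nn.criterion.focal import FocalLossBinary
    >>> criterion = FocalLossBinary(gamma=2.0, alpha=0.25)
    >>> logits = torch.randn(16)  # raw scores for the positive class
    >>> targets = torch.randint(0, 2, (16,)).float()
    >>> loss = criterion(logits, targets)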
class catalyst.contrib.nn.criterion.focal.FocalLossMultiClass(ignore: int = None, reduced: bool = False, gamma: float = 2.0, alpha: float = 0.25, threshold: float = 0.5, reduction: str = 'mean')

    Bases: catalyst.contrib.nn.criterion.focal.FocalLossBinary

    Computes focal loss for a multi-class problem. Ignores targets with the label -1.

    It has been proposed in the Focal Loss for Dense Object Detection paper.

    @TODO: Docs (add Example). Contribution is welcome.
class catalyst.contrib.nn.criterion.gan.MeanOutputLoss

    Bases: torch.nn.modules.module.Module

    Criterion that computes a simple mean of the output, completely ignoring the target (maybe useful, e.g., for WGAN real/fake validity averaging).
class catalyst.contrib.nn.criterion.gan.GradientPenaltyLoss

    Bases: torch.nn.modules.module.Module

    Criterion that computes the gradient penalty.

    WARN: should not be run with CriterionCallback; use the special GradientPenaltyCallback instead.
class catalyst.contrib.nn.criterion.huber.HuberLoss(clip_delta=1.0, reduction='mean')

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.criterion.iou.IoULoss(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')

    Bases: torch.nn.modules.module.Module

    The intersection over union (Jaccard) loss.

    @TODO: Docs. Contribution is welcome.

    __init__(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')

        Parameters:
            - eps (float) – epsilon to avoid zero division
            - threshold (float) – threshold for outputs binarization
            - activation (str) – a torch.nn activation applied to the outputs; must be one of 'none', 'Sigmoid', 'Softmax2d'
class catalyst.contrib.nn.criterion.iou.BCEIoULoss(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid', reduction: str = 'mean')

    Bases: torch.nn.modules.module.Module

    The intersection over union (Jaccard) with BCE loss.

    @TODO: Docs. Contribution is welcome.

    __init__(eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid', reduction: str = 'mean')

        Parameters:
            - eps (float) – epsilon to avoid zero division
            - threshold (float) – threshold for outputs binarization
            - activation (str) – a torch.nn activation applied to the outputs; must be one of 'none', 'Sigmoid', 'Softmax2d'
            - reduction (str) – specifies the reduction to apply to the output of BCE
class catalyst.contrib.nn.criterion.lovasz.LovaszLossBinary(per_image=False, ignore=None)

    Bases: torch.nn.modules.loss._Loss

    Creates a criterion that optimizes a binary Lovasz loss.

    It has been proposed in The Lovasz-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks.
class catalyst.contrib.nn.criterion.lovasz.LovaszLossMultiClass(per_image=False, ignore=None)

    Bases: torch.nn.modules.loss._Loss

    Creates a criterion that optimizes a multi-class Lovasz loss.

    It has been proposed in The Lovasz-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks.
class catalyst.contrib.nn.criterion.lovasz.LovaszLossMultiLabel(per_image=False, ignore=None)

    Bases: torch.nn.modules.loss._Loss

    Creates a criterion that optimizes a multi-label Lovasz loss.

    It has been proposed in The Lovasz-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks.
class catalyst.contrib.nn.criterion.margin.MarginLoss(alpha: float = 0.2, beta: float = 1.0, skip_labels: Union[int, List[int]] = -1)

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.criterion.triplet.TripletLoss(margin: float = 0.3)

    Bases: torch.nn.modules.module.Module

    Triplet loss with hard positive/negative mining.

    Reference:
        Code imported from https://github.com/NegatioN/OnlineMiningTripletLoss
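    Example (a hedged sketch; the forward(embeddings, targets) signature is an assumption based on the online-mining reference implementation above):

    >>> import torch
    >>> from catalyst.contrib.nn.criterion.triplet import TripletLoss
    >>> criterion = TripletLoss(margin=0.3)
    >>> embeddings = torch.randn(16, 128)
    >>> targets = torch.randint(0, 4, (16,))  # class labels used for mining
    >>> loss = criterion(embeddings, targets)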
class catalyst.contrib.nn.criterion.triplet.TripletPairwiseEmbeddingLoss(margin: float = 0.3, reduction: str = 'mean')

    Bases: torch.nn.modules.module.Module

    TripletPairwiseEmbeddingLoss – proof-of-concept criterion. Still a work in progress.

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.criterion.wing.WingLoss(width: int = 5, curvature: float = 0.5, reduction: str = 'mean')

    Bases: torch.nn.modules.module.Module

    Creates a criterion that optimizes a Wing loss.

    It has been proposed in Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks.

    Examples:
        @TODO: Docs. Contribution is welcome.

    Main origins of inspiration:
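    Example (a hedged sketch; the forward(outputs, targets) signature is an assumption):

    >>> import torch
    >>> from catalyst.contrib.nn.criterion.wing import WingLoss
    >>> criterion = WingLoss(width=5, curvature=0.5, reduction="mean")
    >>> outputs = torch.randn(8, 10)  # e.g. predicted landmark coordinates
    >>> targets = torch.randn(8, 10)
    >>> loss = criterion(outputs, targets)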
Modules¶
class catalyst.contrib.nn.modules.common.Flatten

    Bases: torch.nn.modules.module.Module

    Flattens the input. Does not affect the batch size.

    @TODO: Docs (add Example). Contribution is welcome.
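    Example (a minimal sketch of the documented behavior):

    >>> import torch
    >>> from catalyst.contrib.nn.modules.common import Flatten
    >>> x = torch.randn(4, 3, 8, 8)
    >>> Flatten()(x).shape
    torch.Size([4, 192])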
class catalyst.contrib.nn.modules.common.Lambda(lambda_fn)

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.
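    Example (a hedged sketch; it assumes lambda_fn is simply applied to the input in forward):

    >>> import torch
    >>> from catalyst.contrib.nn.modules.common import Lambda
    >>> swish = Lambda(lambda x: x * torch.sigmoid(x))  # wrap a function as a module
    >>> y = swish(torch.randn(4, 8))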
class catalyst.contrib.nn.modules.common.Normalize(**normalize_kwargs)

    Bases: torch.nn.modules.module.Module

    Performs L_p normalization of inputs over the specified dimension.

    @TODO: Docs (add Example). Contribution is welcome.
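    Example (a hedged sketch; it assumes **normalize_kwargs are forwarded to torch.nn.functional.normalize):

    >>> import torch
    >>> from catalyst.contrib.nn.modules.common import Normalize
    >>> norm = Normalize(p=2, dim=1)
    >>> y = norm(torch.randn(4, 128))  # rows now have unit L2 norm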
class catalyst.contrib.nn.modules.lama.TemporalLastPooling

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.modules.lama.TemporalAvgPooling

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.modules.lama.TemporalMaxPooling

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.modules.lama.TemporalDropLastWrapper(net)

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.modules.lama.TemporalAttentionPooling(in_features, activation=None, kernel_size=1, **params)

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.

    __init__(in_features, activation=None, kernel_size=1, **params)

        @TODO: Docs. Contribution is welcome.

    forward(x: torch.Tensor, mask: torch.Tensor = None) → torch.Tensor

        Parameters:
            - x (torch.Tensor) – tensor of size (batch_size, history_len, feature_size)

        @TODO: Docs. Contribution is welcome.

    name2activation = {'sigmoid': Sigmoid(), 'softmax': Softmax(dim=1), 'tanh': Tanh()}
class catalyst.contrib.nn.modules.lama.TemporalConcatPooling(in_features, history_len=1)

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.modules.lama.LamaPooling(in_features, groups=None)

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.

    available_groups = ['last', 'avg', 'avg_droplast', 'max', 'max_droplast', 'sigmoid', 'sigmoid_droplast', 'softmax', 'softmax_droplast', 'tanh', 'tanh_droplast']
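    Example (a hedged sketch; the (batch_size, history_len, feature_size) input layout is taken from TemporalAttentionPooling.forward above, and the output width is assumed to grow with the number of groups):

    >>> import torch
    >>> from catalyst.contrib.nn.modules.lama import LamaPooling
    >>> pooling = LamaPooling(in_features=64, groups=["last", "avg", "max"])
    >>> x = torch.randn(4, 10, 64)  # (batch_size, history_len, feature_size)
    >>> y = pooling(x)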
class catalyst.contrib.nn.modules.pooling.GlobalAttnPool2d(in_features, activation_fn='Sigmoid')

    Bases: torch.nn.modules.module.Module

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.nn.modules.pooling.GlobalAvgAttnPool2d(in_features, activation_fn='Sigmoid')

    Bases: torch.nn.modules.module.Module

    @TODO: Docs (add Example). Contribution is welcome.

class catalyst.contrib.nn.modules.pooling.GlobalAvgPool2d

    Bases: torch.nn.modules.module.Module

    Applies a 2D global average pooling operation over an input signal composed of several input planes.

    @TODO: Docs (add Example). Contribution is welcome.

class catalyst.contrib.nn.modules.pooling.GlobalConcatAttnPool2d(in_features, activation_fn='Sigmoid')

    Bases: torch.nn.modules.module.Module

    @TODO: Docs (add Example). Contribution is welcome.

class catalyst.contrib.nn.modules.pooling.GlobalConcatPool2d

    Bases: torch.nn.modules.module.Module

    @TODO: Docs (add Example). Contribution is welcome.

class catalyst.contrib.nn.modules.pooling.GlobalMaxAttnPool2d(in_features, activation_fn='Sigmoid')

    Bases: torch.nn.modules.module.Module

    @TODO: Docs (add Example). Contribution is welcome.
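    Example for GlobalAvgPool2d (a minimal sketch; the (batch, channels, 1, 1) output shape is an assumption consistent with global pooling):

    >>> import torch
    >>> from catalyst.contrib.nn.modules.pooling import GlobalAvgPool2d
    >>> x = torch.randn(4, 16, 8, 8)  # (batch, channels, height, width)
    >>> GlobalAvgPool2d()(x).shape
    torch.Size([4, 16, 1, 1])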
Optimizers¶
class catalyst.contrib.nn.optimizers.lamb.Lamb(params, lr: Optional[float] = 0.001, betas: Optional[Tuple[float, float]] = (0.9, 0.999), eps: Optional[float] = 1e-06, weight_decay: Optional[float] = 0.0, adam: Optional[bool] = False)

    Bases: torch.optim.optimizer.Optimizer

    Implements the Lamb algorithm.

    It has been proposed in Training BERT in 76 minutes.

    __init__(params, lr: Optional[float] = 0.001, betas: Optional[Tuple[float, float]] = (0.9, 0.999), eps: Optional[float] = 1e-06, weight_decay: Optional[float] = 0.0, adam: Optional[bool] = False)

        Parameters:
            - params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
            - lr (float, optional) – learning rate (default: 1e-3)
            - betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
            - eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-6)
            - weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
            - adam (bool, optional) – always use trust ratio = 1, which turns this into Adam; useful for comparison purposes
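    Example (a minimal sketch; the standard PyTorch optimizer loop with the documented constructor):

    >>> import torch
    >>> from catalyst.contrib.nn.optimizers.lamb import Lamb
    >>> model = torch.nn.Linear(10, 2)
    >>> optimizer = Lamb(model.parameters(), lr=1e-3, weight_decay=0.01)
    >>> optimizer.zero_grad()
    >>> loss = model(torch.randn(4, 10)).sum()
    >>> loss.backward()
    >>> optimizer.step()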
catalyst.contrib.nn.optimizers.lamb.log_lamb_rs(optimizer: torch.optim.optimizer.Optimizer, event_writer, token_count: int)

    Log a histogram of trust ratio scalars across layers.
class catalyst.contrib.nn.optimizers.lookahead.Lookahead(optimizer: torch.optim.optimizer.Optimizer, k: int = 5, alpha: float = 0.5)

    Bases: torch.optim.optimizer.Optimizer

    Implements the Lookahead algorithm.

    It has been proposed in Lookahead Optimizer: k steps forward, 1 step back.

    Main origins of inspiration:
        https://github.com/alphadl/lookahead.pytorch (MIT License)

    __init__(optimizer: torch.optim.optimizer.Optimizer, k: int = 5, alpha: float = 0.5)

        @TODO: Docs. Contribution is welcome.

    classmethod get_from_params(params: Dict, base_optimizer_params: Dict = None, **kwargs) → catalyst.contrib.nn.optimizers.lookahead.Lookahead

        @TODO: Docs. Contribution is welcome.
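    Example (a minimal sketch; Lookahead wraps a base optimizer and is then used like a regular optimizer):

    >>> import torch
    >>> from catalyst.contrib.nn.optimizers.lookahead import Lookahead
    >>> model = torch.nn.Linear(10, 2)
    >>> base_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    >>> optimizer = Lookahead(base_optimizer, k=5, alpha=0.5)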
class catalyst.contrib.nn.optimizers.qhadamw.QHAdamW(params, lr=0.001, betas=(0.995, 0.999), nus=(0.7, 1.0), weight_decay=0.0, eps=1e-08)

    Bases: torch.optim.optimizer.Optimizer

    Implements the QHAdam algorithm.

    Combines the QHAdam algorithm, proposed in Quasi-hyperbolic momentum and Adam for deep learning, with weight decay decoupling from the Decoupled Weight Decay Regularization paper.

    Example:

    >>> optimizer = QHAdamW(
    ...     model.parameters(),
    ...     lr=3e-4, nus=(0.8, 1.0), betas=(0.99, 0.999))
    >>> optimizer.zero_grad()
    >>> loss_fn(model(input), target).backward()
    >>> optimizer.step()

    Main origins of inspiration:
        https://github.com/iprally/qhadamw-pytorch/blob/master/qhadamw.py (MIT License)

    __init__(params, lr=0.001, betas=(0.995, 0.999), nus=(0.7, 1.0), weight_decay=0.0, eps=1e-08)

        Parameters:
            - params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
            - lr (float, optional) – learning rate (alpha from the paper) (default: 1e-3)
            - betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.995, 0.999))
            - nus (Tuple[float, float], optional) – immediate discount factors used to estimate the gradient and its square (default: (0.7, 1.0))
            - eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
            - weight_decay (float, optional) – weight decay (L2 regularization coefficient, times two) (default: 0.0)
class catalyst.contrib.nn.optimizers.radam.RAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)

    Bases: torch.optim.optimizer.Optimizer

    Implements the RAdam algorithm.

    It has been proposed in On the Variance of the Adaptive Learning Rate and Beyond.

    @TODO: Docs (add Example). Contribution is welcome.

    Main origins of inspiration:
        https://github.com/LiyuanLucasLiu/RAdam (Apache-2.0 License)
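    Example (a minimal sketch; the standard PyTorch optimizer loop with the documented constructor):

    >>> import torch
    >>> from catalyst.contrib.nn.optimizers.radam import RAdam
    >>> model = torch.nn.Linear(10, 2)
    >>> optimizer = RAdam(model.parameters(), lr=1e-3)
    >>> optimizer.zero_grad()
    >>> loss = model(torch.randn(4, 10)).sum()
    >>> loss.backward()
    >>> optimizer.step()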
class catalyst.contrib.nn.optimizers.ralamb.Ralamb(params: Iterable, lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0)

    Bases: torch.optim.optimizer.Optimizer

    RAdam optimizer with LARS/LAMB tricks.

    Main origins of inspiration:
        https://github.com/mgrankin/over9000/blob/master/ralamb.py (Apache-2.0 License)

    __init__(params: Iterable, lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0)

        Parameters:
            - params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
            - lr (float, optional) – learning rate (default: 1e-3)
            - betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
            - eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
            - weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
Schedulers¶
class catalyst.contrib.nn.schedulers.base.BaseScheduler(optimizer, last_epoch=-1)

    Bases: torch.optim.lr_scheduler._LRScheduler, abc.ABC

    Base class for all schedulers with momentum update.
class catalyst.contrib.nn.schedulers.base.BatchScheduler(optimizer, last_epoch=-1)

    Bases: catalyst.contrib.nn.schedulers.base.BaseScheduler, abc.ABC

    @TODO: Docs. Contribution is welcome.
class catalyst.contrib.nn.schedulers.onecycle.OneCycleLRWithWarmup(optimizer: torch.optim.optimizer.Optimizer, num_steps: int, lr_range=(1.0, 0.005), init_lr: float = None, warmup_steps: int = 0, warmup_fraction: float = None, decay_steps: int = 0, decay_fraction: float = None, momentum_range=(0.8, 0.99, 0.999), init_momentum: float = None)

    Bases: catalyst.contrib.nn.schedulers.base.BatchScheduler

    OneCycle scheduler with warm-up and lr decay stages.

    The first stage, called warmup, increases lr from init_lr to max_lr and decreases momentum from init_momentum to min_momentum; it takes warmup_steps steps. The second is the annealing stage: lr decreases from max_lr to min_lr while momentum increases from min_momentum to max_momentum. The third, optional, stage is lr decay.

    __init__(optimizer: torch.optim.optimizer.Optimizer, num_steps: int, lr_range=(1.0, 0.005), init_lr: float = None, warmup_steps: int = 0, warmup_fraction: float = None, decay_steps: int = 0, decay_fraction: float = None, momentum_range=(0.8, 0.99, 0.999), init_momentum: float = None)

        Parameters:
            - optimizer – PyTorch optimizer
            - num_steps (int) – total number of steps
            - lr_range – tuple with two or three elements (max_lr, min_lr, [final_lr])
            - init_lr (float, optional) – initial lr
            - warmup_steps (int) – count of steps for the warm-up stage
            - warmup_fraction (float, optional) – fraction in [0; 1) used to calculate the number of warmup steps; cannot be set together with warmup_steps
            - decay_steps (int) – count of steps for the lr decay stage
            - decay_fraction (float, optional) – fraction in [0; 1) used to calculate the number of decay steps; cannot be set together with decay_steps
            - momentum_range – tuple with two or three elements (min_momentum, max_momentum, [final_momentum])
            - init_momentum (float, optional) – initial momentum
    get_lr() → List[float]

        Function that returns the new lr for the optimizer.

        Returns:
            calculated lr for every param group

        Return type:
            List[float]

    get_momentum() → List[float]

        Function that returns the new momentum for the optimizer.

        Returns:
            calculated momentum for every param group

        Return type:
            List[float]
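    Example for OneCycleLRWithWarmup (a minimal sketch; as a BatchScheduler it is assumed to be stepped once per training batch):

    >>> import torch
    >>> from catalyst.contrib.nn.schedulers.onecycle import OneCycleLRWithWarmup
    >>> model = torch.nn.Linear(10, 2)
    >>> optimizer = torch.optim.SGD(model.parameters(), lr=1.0, momentum=0.9)
    >>> scheduler = OneCycleLRWithWarmup(
    ...     optimizer,
    ...     num_steps=1000,
    ...     lr_range=(1.0, 0.005),
    ...     warmup_steps=100,
    ...     momentum_range=(0.85, 0.95),
    ... )
    >>> scheduler.step()  # once per training batch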
Models¶
Segmentation¶
class catalyst.contrib.models.cv.segmentation.unet.ResnetUnet(num_classes: int = 1, arch: str = 'resnet18', pretrained: bool = True, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)

    Bases: catalyst.contrib.models.cv.segmentation.core.ResnetUnetSpec

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.unet.Unet(num_classes: int = 1, in_channels: int = 3, num_channels: int = 32, num_blocks: int = 4, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)

    Bases: catalyst.contrib.models.cv.segmentation.core.UnetSpec

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.linknet.Linknet(num_classes: int = 1, in_channels: int = 3, num_channels: int = 32, num_blocks: int = 4, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)

    Bases: catalyst.contrib.models.cv.segmentation.core.UnetSpec

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.linknet.ResnetLinknet(num_classes: int = 1, arch: str = 'resnet18', pretrained: bool = True, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)

    Bases: catalyst.contrib.models.cv.segmentation.core.ResnetUnetSpec

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.fpn.FPNUnet(num_classes: int = 1, in_channels: int = 3, num_channels: int = 32, num_blocks: int = 4, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)

    Bases: catalyst.contrib.models.cv.segmentation.core.UnetSpec

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.fpn.ResnetFPNUnet(num_classes: int = 1, arch: str = 'resnet18', pretrained: bool = True, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)

    Bases: catalyst.contrib.models.cv.segmentation.core.ResnetUnetSpec

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.psp.PSPnet(num_classes: int = 1, in_channels: int = 3, num_channels: int = 32, num_blocks: int = 4, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)

    Bases: catalyst.contrib.models.cv.segmentation.core.UnetSpec

    @TODO: Docs. Contribution is welcome.

class catalyst.contrib.models.cv.segmentation.psp.ResnetPSPnet(num_classes: int = 1, arch: str = 'resnet18', pretrained: bool = True, encoder_params: Dict = None, bridge_params: Dict = None, decoder_params: Dict = None, head_params: Dict = None, state_dict: Union[dict, str, pathlib.Path] = None)

    Bases: catalyst.contrib.models.cv.segmentation.core.ResnetUnetSpec

    @TODO: Docs. Contribution is welcome.
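    Example for Unet (a minimal sketch; the (batch, num_classes, H, W) output layout is an assumption):

    >>> import torch
    >>> from catalyst.contrib.models.cv.segmentation.unet import Unet
    >>> model = Unet(num_classes=1, in_channels=3, num_channels=32, num_blocks=4)
    >>> logits = model(torch.randn(1, 3, 256, 256))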
Registry¶
Catalyst subpackage registries.
catalyst.contrib.registry.Criterion(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

    Adds a factory to the registry under its __name__ attribute or the provided name. The signature is flexible.

    Parameters:
        - factory – factory instance
        - factories – more instances
        - name – name for the first instance; use only when passing a single instance
        - named_factories – factories and their names as kwargs

    Returns:
        the first factory passed

    Return type:
        (Factory)

catalyst.contrib.registry.Optimizer(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

    Adds a factory to the registry under its __name__ attribute or the provided name. The signature is flexible.

    Parameters:
        - factory – factory instance
        - factories – more instances
        - name – name for the first instance; use only when passing a single instance
        - named_factories – factories and their names as kwargs

    Returns:
        the first factory passed

    Return type:
        (Factory)

catalyst.contrib.registry.Scheduler(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

    Adds a factory to the registry under its __name__ attribute or the provided name. The signature is flexible.

    Parameters:
        - factory – factory instance
        - factories – more instances
        - name – name for the first instance; use only when passing a single instance
        - named_factories – factories and their names as kwargs

    Returns:
        the first factory passed

    Return type:
        (Factory)

catalyst.contrib.registry.Module(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

    Adds a factory to the registry under its __name__ attribute or the provided name. The signature is flexible.

    Parameters:
        - factory – factory instance
        - factories – more instances
        - name – name for the first instance; use only when passing a single instance
        - named_factories – factories and their names as kwargs

    Returns:
        the first factory passed

    Return type:
        (Factory)

catalyst.contrib.registry.Model(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

    Adds a factory to the registry under its __name__ attribute or the provided name. The signature is flexible.

    Parameters:
        - factory – factory instance
        - factories – more instances
        - name – name for the first instance; use only when passing a single instance
        - named_factories – factories and their names as kwargs

    Returns:
        the first factory passed

    Return type:
        (Factory)

catalyst.contrib.registry.Sampler(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

    Adds a factory to the registry under its __name__ attribute or the provided name. The signature is flexible.

    Parameters:
        - factory – factory instance
        - factories – more instances
        - name – name for the first instance; use only when passing a single instance
        - named_factories – factories and their names as kwargs

    Returns:
        the first factory passed

    Return type:
        (Factory)

catalyst.contrib.registry.Transform(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

    Adds a factory to the registry under its __name__ attribute or the provided name. The signature is flexible.

    Parameters:
        - factory – factory instance
        - factories – more instances
        - name – name for the first instance; use only when passing a single instance
        - named_factories – factories and their names as kwargs

    Returns:
        the first factory passed

    Return type:
        (Factory)

catalyst.contrib.registry.Experiment(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]]

    Adds a factory to the registry under its __name__ attribute or the provided name. The signature is flexible.

    Parameters:
        - factory – factory instance
        - factories – more instances
        - name – name for the first instance; use only when passing a single instance
        - named_factories – factories and their names as kwargs

    Returns:
        the first factory passed

    Return type:
        (Factory)
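    Example (a hedged sketch; since each registry function returns the first factory passed, it can be used as a decorator, after which the registered name can be referenced, e.g., from config files; the criterion itself is hypothetical):

    >>> import torch
    >>> from catalyst.contrib.registry import Criterion
    >>> @Criterion
    ... class MyLoss(torch.nn.Module):
    ...     def forward(self, outputs, targets):  # hypothetical custom criterion
    ...         return (outputs - targets).abs().mean()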
Utilities¶
Argparse¶
catalyst.contrib.utils.argparse.boolean_flag(parser: argparse.ArgumentParser, name: str, default: Optional[bool] = False, help: str = None, shorthand: str = None) → None

    Add a boolean flag to a parser in-place.

    Example:

    >>> parser = argparse.ArgumentParser()
    >>> boolean_flag(
    ...     parser, "flag", default=False, help="some flag", shorthand="f"
    ... )

    Parameters:
        - parser (argparse.ArgumentParser) – parser to add the flag to
        - name (str) – argument name; --<name> enables the flag, while --no-<name> disables it
        - default (bool, optional) – default value of the flag
        - help (str) – help string for the flag
        - shorthand (str) – shorthand string for the argument
Compression¶
catalyst.contrib.utils.compression.pack(data)

    Serialize the data into bytes using pickle.

    Parameters:
        - data – a value

    Returns:
        a bytes object with the pickle-serialized data

catalyst.contrib.utils.compression.pack_if_needed(data)

    Serialize the data into bytes using pickle.

    Parameters:
        - data – a value

    Returns:
        a bytes object with the pickle-serialized data

catalyst.contrib.utils.compression.unpack(data)

    Deserialize bytes into an object using pickle.

    Parameters:
        - data – a bytes object containing pickle-serialized data

    Returns:
        the value deserialized from the bytes-like object

catalyst.contrib.utils.compression.unpack_if_needed(data)

    Deserialize bytes into an object using pickle.

    Parameters:
        - data – a bytes object containing pickle-serialized data

    Returns:
        the value deserialized from the bytes-like object
Confusion Matrix¶
catalyst.contrib.utils.confusion_matrix.calculate_tp_fp_fn(confusion_matrix: numpy.ndarray) → numpy.ndarray

    @TODO: Docs. Contribution is welcome.

catalyst.contrib.utils.confusion_matrix.calculate_confusion_matrix_from_arrays(ground_truth: numpy.ndarray, prediction: numpy.ndarray, num_classes: int) → numpy.ndarray

    Calculate the confusion matrix for a given set of classes. If a ground-truth value is outside [0, num_classes), it is excluded.

    Parameters:
        - ground_truth (np.ndarray) –
        - prediction (np.ndarray) –
        - num_classes (int) –

    @TODO: Docs. Contribution is welcome.
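    Example for calculate_confusion_matrix_from_arrays (a minimal sketch; the row/column orientation of the result is an assumption):

    >>> import numpy as np
    >>> from catalyst.contrib.utils.confusion_matrix import (
    ...     calculate_confusion_matrix_from_arrays,
    ... )
    >>> ground_truth = np.array([0, 1, 2, 1])
    >>> prediction = np.array([0, 2, 2, 1])
    >>> cm = calculate_confusion_matrix_from_arrays(
    ...     ground_truth, prediction, num_classes=3
    ... )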
Dataset¶
catalyst.contrib.utils.dataset.create_dataset(dirs: str, extension: str = None, process_fn: Callable[[str], object] = None, recursive: bool = False) → Dict[str, object]

    Create a dataset (a dict like {key: [values]}) from a vctk-like dataset:

        dataset/
            cat/
                *.ext
            dog/
                *.ext

    Parameters:
        - dirs (str) – path to dirs, for example /home/user/data/**
        - extension (str) – data extension you are looking for
        - process_fn (Callable[[str], object]) – function(path_to_file) -> object, a process function for found files; None by default
        - recursive (bool) – enables recursive globbing

    Returns:
        dataset

    Return type:
        dict
catalyst.contrib.utils.dataset.create_dataframe(dataset: Dict[str, object], **dataframe_args) → pandas.core.frame.DataFrame

    Create a pd.DataFrame from a dict like {key: [values]}.

    Parameters:
        - dataset – dict like {key: [values]}
        - **dataframe_args –
            index : Index or array-like
                Index to use for the resulting frame. Defaults to np.arange(n) if no indexing information is part of the input data and no index is provided.
            columns : Index or array-like
                Column labels to use for the resulting frame. Defaults to np.arange(n) if no column labels are provided.
            dtype : dtype, default None
                Data type to force; otherwise infer.

    Returns:
        dataframe from the given dataset

    Return type:
        pd.DataFrame
catalyst.contrib.utils.dataset.split_dataset_train_test(dataset: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[Dict[str, object], Dict[str, object]]

    Split a dataset into train and test parts.

    Parameters:
        - dataset – dict-like dataset
        - **train_test_split_args –
            test_size : float, int, or None (default: None)
                If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.
            train_size : float, int, or None (default: None)
                If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.
            random_state : int or RandomState
                Pseudo-random number generator state used for random sampling.
            stratify : array-like or None (default: None)
                If not None, data is split in a stratified fashion, using this as the class labels.

    Returns:
        train and test dicts
Misc¶
catalyst.contrib.utils.misc.args_are_not_none(*args: Optional[Any]) → bool

    Check that all arguments are not None.

    Parameters:
        - *args (Any) – values

    Returns:
        True if all values are not None, False otherwise

    Return type:
        bool
Pandas¶
catalyst.contrib.utils.pandas.dataframe_to_list(dataframe: pandas.core.frame.DataFrame) → List[dict]

    Converts a dataframe to a list of rows (without indexes).

    Parameters:
        - dataframe (DataFrame) – input dataframe

    Returns:
        list of rows

    Return type:
        (List[dict])
catalyst.contrib.utils.pandas.folds_to_list(folds: Union[list, str, pandas.core.series.Series]) → List[int]

    Formats a string or a list of numbers into a list of unique ints.

    Examples:

    >>> folds_to_list("1,2,1,3,4,2,4,6")
    [1, 2, 3, 4, 6]
    >>> folds_to_list([1, 2, 3.0, 5])
    [1, 2, 3, 5]

    Parameters:
        - folds (Union[list, str, pd.Series]) – either a list of numbers, a string with numbers separated by commas, or a pandas series

    Returns:
        list of unique ints

    Return type:
        List[int]

    Raises:
        ValueError – if a value in the string or array cannot be cast to int
catalyst.contrib.utils.pandas.split_dataframe(dataframe: pandas.core.frame.DataFrame, train_folds: List[int], valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, tag2class: Optional[Dict[str, int]] = None, tag_column: str = None, class_column: str = None, seed: int = 42, n_folds: int = 5) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]

    Split a Pandas DataFrame into folds.

    Parameters:
        - dataframe (pd.DataFrame) – input dataframe
        - train_folds (List[int]) – train folds
        - valid_folds (List[int], optional) – valid folds; if None, takes all folds not included in train_folds
        - infer_folds (List[int], optional) – infer folds; if None, takes all folds not included in train_folds and valid_folds
        - tag2class (Dict[str, int], optional) – mapping from label names to ints
        - tag_column (str, optional) – column with label names
        - class_column (str, optional) – column to use for split
        - seed (int) – seed for split
        - n_folds (int) – number of folds

    Returns:
        tuple with 4 dataframes: whole dataframe, train part, valid part and infer part

    Return type:
        (tuple)
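    Example (a minimal sketch; the column names are hypothetical):

    >>> import pandas as pd
    >>> from catalyst.contrib.utils.pandas import split_dataframe
    >>> df = pd.DataFrame({
    ...     "path": [f"img_{i}.jpg" for i in range(10)],
    ...     "label": ["cat", "dog"] * 5,
    ... })
    >>> whole, train, valid, infer = split_dataframe(
    ...     df,
    ...     train_folds=[0, 1, 2],
    ...     valid_folds=[3],
    ...     infer_folds=[4],
    ...     tag2class={"cat": 0, "dog": 1},
    ...     tag_column="label",
    ...     class_column="class",
    ...     n_folds=5,
    ... )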
catalyst.contrib.utils.pandas.split_dataframe_on_column_folds(dataframe: pandas.core.frame.DataFrame, column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame

    Splits a DataFrame into N folds.

    Parameters:
        - dataframe – a dataset
        - column – which column to use
        - random_state – seed for random shuffle
        - n_folds – number of result folds

    Returns:
        new dataframe with fold column

    Return type:
        pd.DataFrame
catalyst.contrib.utils.pandas.split_dataframe_on_folds(dataframe: pandas.core.frame.DataFrame, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame

    Splits a DataFrame into N folds.

    Parameters:
        - dataframe – a dataset
        - random_state – seed for random shuffle
        - n_folds – number of result folds

    Returns:
        new dataframe with fold column

    Return type:
        pd.DataFrame
catalyst.contrib.utils.pandas.split_dataframe_on_stratified_folds(dataframe: pandas.core.frame.DataFrame, class_column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame

    Splits a DataFrame into N stratified folds.

    Also see: catalyst.data.sampler.BalanceClassSampler

    Parameters:
        - dataframe – a dataset
        - class_column – which column to use for split
        - random_state – seed for random shuffle
        - n_folds – number of result folds

    Returns:
        new dataframe with fold column

    Return type:
        pd.DataFrame
catalyst.contrib.utils.pandas.split_dataframe_train_test(dataframe: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]

    Split a dataframe into train and test parts.

    Parameters:
        - dataframe – pd.DataFrame to split
        - **train_test_split_args –
            test_size : float, int, or None (default: None)
                If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.
            train_size : float, int, or None (default: None)
                If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.
            random_state : int or RandomState
                Pseudo-random number generator state used for random sampling.
            stratify : array-like or None (default: None)
                If not None, data is split in a stratified fashion, using this as the class labels.

    Returns:
        train and test DataFrames

    Note:
        It exists because the sklearn split is overcomplicated.
Separates values in the class_column column.

    Parameters:
        - dataframe – a dataset
        - tag_column – column name to separate values
        - tag_delim – delimiter to separate values

    Returns:
        new dataframe

    Return type:
        pd.DataFrame
catalyst.contrib.utils.pandas.read_multiple_dataframes(in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]

    This function reads train/valid/infer dataframes from the given paths.

    Parameters:
        - in_csv_train (str) – paths to train csv files separated by commas
        - in_csv_valid (str) – paths to valid csv files separated by commas
        - in_csv_infer (str) – paths to infer csv files separated by commas
        - tag2class (Dict[str, int], optional) – mapping from label names to ints
        - tag_column (str, optional) – column with label names
        - class_column (str, optional) – column to use for split

    Returns:
        tuple with 4 dataframes: whole dataframe, train part, valid part and infer part

    Return type:
        (tuple)
catalyst.contrib.utils.pandas.map_dataframe(dataframe: pandas.core.frame.DataFrame, tag_column: str, class_column: str, tag2class: Dict[str, int], verbose: bool = False) → pandas.core.frame.DataFrame

    This function maps tags from tag_column to ints in class_column using the tag2class dictionary.

    Parameters:
        - dataframe (pd.DataFrame) – input dataframe
        - tag_column (str) – column with tags
        - class_column (str) – column to store the mapped class labels
        - tag2class (Dict[str, int]) – mapping from tags to class labels
        - verbose – if true, uses tqdm

    Returns:
        updated dataframe with class_column

    Return type:
        pd.DataFrame
catalyst.contrib.utils.pandas.get_dataset_labeling(dataframe: pandas.core.frame.DataFrame, tag_column: str) → Dict[str, int]

    Prepares a mapping using unique values from tag_column:

        {
            "class_name_0": 0,
            "class_name_1": 1,
            ...
            "class_name_N": N
        }

    Parameters:
        - dataframe – a dataset
        - tag_column – which column to use

    Returns:
        mapping from tag to labels

    Return type:
        Dict[str, int]
catalyst.contrib.utils.pandas.merge_multiple_fold_csv(fold_name: str, paths: Optional[str]) → pandas.core.frame.DataFrame

    Reads csv files into one DataFrame with a fold column.

    Parameters:
        - fold_name (str) – current fold name
        - paths (str) – paths to csv files separated by commas

    Returns:
        merged dataframes with column fold == fold_name

    Return type:
        pd.DataFrame
catalyst.contrib.utils.pandas.read_csv_data(in_csv: str = None, train_folds: Optional[List[int]] = None, valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, seed: int = 42, n_folds: int = 5, in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, List[dict], List[dict], List[dict]]

    From the given path in_csv, reads a dataframe and splits it into train/valid/infer folds; alternatively, from the several paths in_csv_train, in_csv_valid, in_csv_infer, reads independent folds.

    Note:
        This function can be used with different combinations of params.
        - The first block is used to get a dataset from one csv: in_csv, train_folds, valid_folds, infer_folds, seed, n_folds
        - The second includes paths to different csv files for the train/valid and infer parts: in_csv_train, in_csv_valid, in_csv_infer
        - The other params (tag2class, tag_column, class_column) are optional for either block

    Parameters:
        - in_csv (str) – paths to the whole dataset
        - train_folds (List[int]) – train folds
        - valid_folds (List[int], optional) – valid folds; if None, takes all folds not included in train_folds
        - infer_folds (List[int], optional) – infer folds; if None, takes all folds not included in train_folds and valid_folds
        - seed (int) – seed for split
        - n_folds (int) – number of folds
        - in_csv_train (str) – paths to train csv files separated by commas
        - in_csv_valid (str) – paths to valid csv files separated by commas
        - in_csv_infer (str) – paths to infer csv files separated by commas
        - tag2class (Dict[str, int]) – mapping from label names to ints
        - tag_column (str) – column with label names
        - class_column (str) – column to use for split

    Returns:
        tuple with 4 elements: whole dataframe, list with train data, list with valid data and list with infer data

    Return type:
        (Tuple[pd.DataFrame, List[dict], List[dict], List[dict]])
catalyst.contrib.utils.pandas.balance_classes(dataframe: pandas.core.frame.DataFrame, class_column: str = 'label', random_state: int = 42, how: str = 'downsampling') → pandas.core.frame.DataFrame

    Balance classes in the dataframe by class_column.

    See also: catalyst.data.sampler.BalanceClassSampler.

    Parameters:
        - dataframe – a dataset
        - class_column – which column to use for split
        - random_state – seed for random shuffle
        - how – sampling strategy; must be one of ["downsampling", "upsampling"]

    Returns:
        new dataframe with balanced class_column

    Return type:
        pd.DataFrame
Parallel¶
catalyst.contrib.utils.parallel.parallel_imap(func, args, pool: Union[multiprocessing.pool.Pool, catalyst.contrib.utils.parallel.DumbPool]) → List[T]

    @TODO: Docs. Contribution is welcome.
Plotly¶
catalyst.contrib.utils.plotly.plot_tensorboard_log(logdir: Union[str, pathlib.Path], step: Optional[str] = 'batch', metrics: Optional[List[str]] = None, height: Optional[int] = None, width: Optional[int] = None) → None

    @TODO: Docs. Contribution is welcome.

catalyst.contrib.utils.plotly.plot_metrics(logdir: Union[str, pathlib.Path], step: Optional[str] = 'epoch', metrics: Optional[List[str]] = None, height: Optional[int] = None, width: Optional[int] = None) → None

    Plots your learning results.

    Parameters:
        - logdir – the logdir that was specified during training
        - step – 'batch' or 'epoch', which logs to show: for batches or for epochs
        - metrics – list of metrics to plot; the loss should be specified as 'loss', the learning rate as '_base/lr', and other metrics by the names in the metrics dict specified during training
        - height – the height of the whole resulting plot
        - width – the width of the whole resulting plot
Serialization¶
catalyst.contrib.utils.serialization.serialize(data)

    Serialize the data into bytes using pickle.

    Parameters:
        - data – a value

    Returns:
        a bytes object with the pickle-serialized data

catalyst.contrib.utils.serialization.deserialize(data)

    Deserialize bytes into an object using pickle.

    Parameters:
        - data – a bytes object containing pickle-serialized data

    Returns:
        the value deserialized from the bytes-like object
Text¶
catalyst.contrib.utils.text.tokenize_text(text: str, tokenizer, max_length: int, strip: bool = True, lowercase: bool = True, remove_punctuation: bool = True) → Dict[str, numpy.array]

    Tokenizes the given text.

    Parameters:
        - text (str) – text to tokenize
        - tokenizer – Tokenizer instance from HuggingFace
        - max_length (int) – maximum length of tokens
        - strip (bool) – if true, strips the text before tokenizing
        - lowercase (bool) – if true, lowercases the text before tokenizing
        - remove_punctuation (bool) – if true, removes string.punctuation from the text before tokenizing
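    Example (a hedged sketch; the HuggingFace tokenizer choice is an assumption):

    >>> from transformers import AutoTokenizer
    >>> from catalyst.contrib.utils.text import tokenize_text
    >>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    >>> features = tokenize_text(
    ...     "Hello, Catalyst!", tokenizer=tokenizer, max_length=128
    ... )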
Visualization¶
catalyst.contrib.utils.visualization.plot_confusion_matrix(cm, class_names=None, normalize=False, title='confusion matrix', fname=None, show=True, figsize=12, fontsize=32, colormap='Blues')

    Renders the confusion matrix and returns matplotlib's figure with it. Normalization can be applied by setting normalize=True.
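    Example (a minimal sketch following the documented signature):

    >>> import numpy as np
    >>> from catalyst.contrib.utils.visualization import plot_confusion_matrix
    >>> cm = np.array([[5, 1], [2, 7]])
    >>> fig = plot_confusion_matrix(
    ...     cm, class_names=["cat", "dog"], normalize=True, show=False
    ... )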
Tools¶
Tensorboard¶
Tensorboard readers:

exception catalyst.contrib.utils.tools.tensorboard.EventReadingException

    Bases: Exception

    An exception that corresponds to an event file reading error.
class catalyst.contrib.utils.tools.tensorboard.EventsFileReader(events_file: BinaryIO)

    Bases: collections.abc.Iterable

    An iterator over a Tensorboard events file.
class catalyst.contrib.utils.tools.tensorboard.SummaryItem(tag, step, wall_time, value, type)

    Bases: tuple

    property tag
        Alias for field number 0

    property step
        Alias for field number 1

    property wall_time
        Alias for field number 2

    property value
        Alias for field number 3

    property type
        Alias for field number 4
class catalyst.contrib.utils.tools.tensorboard.SummaryReader(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))

    Bases: collections.abc.Iterable

    Iterates over events in all the files in the current logdir.

    Note:
        Only scalars and images are supported at the moment.

    __init__(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))

        Initialize a new summary reader.

        Parameters:
            - logdir – a directory with Tensorboard summary data
            - tag_filter – a list of tags to leave (None for all)
            - types – a list of types to get; only "scalar" and "image" types are allowed at the moment
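    Example (a minimal sketch; the logdir path is hypothetical):

    >>> from catalyst.contrib.utils.tools.tensorboard import SummaryReader
    >>> reader = SummaryReader("./logs", types=("scalar",))
    >>> for item in reader:  # items are SummaryItem namedtuples
    ...     print(item.tag, item.step, item.value)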