Contrib¶
Datasets¶
MNIST¶
- class catalyst.contrib.datasets.mnist.MNIST(root, train=True, transform=None, target_transform=None, download=False)[source]¶
Bases:
torch.utils.data.dataset.Dataset
MNIST Dataset.
- __init__(root, train=True, transform=None, target_transform=None, download=False)[source]¶
- Parameters
root – Root directory of dataset where MNIST/processed/training.pt and MNIST/processed/test.pt exist.
train (bool, optional) – If True, creates dataset from training.pt, otherwise from test.pt.
download (bool, optional) – If True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.
transform (callable, optional) – A function/transform that takes in an image and returns a transformed version.
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
- Raises
RuntimeError – If download is False and the dataset is not found.
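Example (a minimal usage sketch, not from the library docs; the exact sample format returned by __getitem__ is an assumption, the class follows the standard map-style Dataset interface):

import torch
from torch.utils.data import DataLoader
from catalyst.contrib.datasets.mnist import MNIST

# downloads the data into "./data" (an arbitrary example path) on the first run
dataset = MNIST(root="./data", train=True, download=True)
image, target = dataset[0]  # assumed (image, label) pair
loader = DataLoader(dataset, batch_size=32, shuffle=True)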
MovieLens¶
- class catalyst.contrib.datasets.movielens.MovieLens(root, train=True, download=False, min_rating=0.0)[source]¶
Bases:
torch.utils.data.dataset.Dataset
MovieLens data sets were collected by the GroupLens Research Project at the University of Minnesota.
This data set consists of:
* 100,000 ratings (1-5) from 943 users on 1682 movies.
* Each user has rated at least 20 movies.
* Simple demographic info for the users (age, gender, occupation, zip)
The data was collected through the MovieLens web site (movielens.umn.edu) during the seven-month period from September 19th, 1997 through April 22nd, 1998. This data has been cleaned up - users who had less than 20 ratings or did not have complete demographic information were removed from this data set. Detailed descriptions of the data file can be found at the end of this file.
Neither the University of Minnesota nor any of the researchers involved can guarantee the correctness of the data, its suitability for any particular purpose, or the validity of results based on the use of the data set. The data set may be used for any research purposes under the following conditions:
* The user may not state or imply any endorsement from the University of Minnesota or the GroupLens Research Group.
* The user must acknowledge the use of the data set in publications resulting from the use of the data set (see below for citation information).
* The user may not redistribute the data without separate permission.
* The user may not use this information for any commercial or revenue-bearing purposes without first obtaining permission from a faculty member of the GroupLens Research Project at the University of Minnesota.
If you have any further questions or comments, please contact GroupLens <grouplens-info@cs.umn.edu>. http://files.grouplens.org/datasets/movielens/ml-100k-README.txt
- __init__(root, train=True, download=False, min_rating=0.0)[source]¶
- Parameters
root (string) – Root directory of dataset where MovieLens/processed/training.pt and MovieLens/processed/test.pt exist.
train (bool, optional) – If True, creates dataset from training.pt, otherwise from test.pt.
download (bool, optional) – If True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.
min_rating (float, optional) – Minimum rating to include in the interaction matrix
- Raises
RuntimeError – If download is False and the dataset is not found.
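Example (a minimal usage sketch, not from the library docs; the exact format of the returned interaction rows is an assumption):

from torch.utils.data import DataLoader
from catalyst.contrib.datasets.movielens import MovieLens

# downloads the ml-100k archive into "./data" on the first run;
# ratings below 3.0 are dropped from the interaction matrix
train_dataset = MovieLens(root="./data", train=True, download=True, min_rating=3.0)
test_dataset = MovieLens(root="./data", train=False, min_rating=3.0)
loader = DataLoader(train_dataset, batch_size=32, shuffle=True)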
Computer Vision¶
Imagenette¶
- class catalyst.contrib.datasets.cv.imagenette.Imagenette(root: str, train: bool = True, download: bool = False, **kwargs)[source]¶
Bases:
catalyst.contrib.datasets.cv.misc.ImageClassificationDataset
Imagenette Dataset.
- __init__(root: str, train: bool = True, download: bool = False, **kwargs)¶
Constructor method for the ImageClassificationDataset class.
- Parameters
root – root directory of dataset
train – if True, creates dataset from the train/ subfolder, otherwise from val/
download – if True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again
**kwargs – keyword arguments passed to the super().__init__ method.
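Example (a minimal usage sketch for the Imagenette family, not from the library docs; the same pattern applies to the resized variants below, and "./data" is an arbitrary example path):

from torch.utils.data import DataLoader
from catalyst.contrib.datasets.cv.imagenette import Imagenette

# downloads and unpacks the archive into `root` on the first run
train_dataset = Imagenette(root="./data", train=True, download=True)
valid_dataset = Imagenette(root="./data", train=False)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)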
Imagenette160¶
- class catalyst.contrib.datasets.cv.imagenette.Imagenette160(root: str, train: bool = True, download: bool = False, **kwargs)[source]¶
Bases:
catalyst.contrib.datasets.cv.misc.ImageClassificationDataset
Imagenette Dataset with images resized so that the shortest size is 160 px.
- __init__(root: str, train: bool = True, download: bool = False, **kwargs)¶
Constructor method for the ImageClassificationDataset class.
- Parameters
root – root directory of dataset
train – if True, creates dataset from the train/ subfolder, otherwise from val/
download – if True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again
**kwargs – keyword arguments passed to the super().__init__ method.
Imagenette320¶
- class catalyst.contrib.datasets.cv.imagenette.Imagenette320(root: str, train: bool = True, download: bool = False, **kwargs)[source]¶
Bases:
catalyst.contrib.datasets.cv.misc.ImageClassificationDataset
Imagenette Dataset with images resized so that the shortest size is 320 px.
- __init__(root: str, train: bool = True, download: bool = False, **kwargs)¶
Constructor method for the ImageClassificationDataset class.
- Parameters
root – root directory of dataset
train – if True, creates dataset from the train/ subfolder, otherwise from val/
download – if True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again
**kwargs – keyword arguments passed to the super().__init__ method.
Imagewang¶
- class catalyst.contrib.datasets.cv.imagewang.Imagewang(root: str, train: bool = True, download: bool = False, **kwargs)[source]¶
Bases:
catalyst.contrib.datasets.cv.misc.ImageClassificationDataset
Imagewang Dataset.
- __init__(root: str, train: bool = True, download: bool = False, **kwargs)¶
Constructor method for the ImageClassificationDataset class.
- Parameters
root – root directory of dataset
train – if True, creates dataset from the train/ subfolder, otherwise from val/
download – if True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again
**kwargs – keyword arguments passed to the super().__init__ method.
Imagewang160¶
- class catalyst.contrib.datasets.cv.imagewang.Imagewang160(root: str, train: bool = True, download: bool = False, **kwargs)[source]¶
Bases:
catalyst.contrib.datasets.cv.misc.ImageClassificationDataset
Imagewang Dataset with images resized so that the shortest size is 160 px.
- __init__(root: str, train: bool = True, download: bool = False, **kwargs)¶
Constructor method for the ImageClassificationDataset class.
- Parameters
root – root directory of dataset
train – if True, creates dataset from the train/ subfolder, otherwise from val/
download – if True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again
**kwargs – keyword arguments passed to the super().__init__ method.
Imagewang320¶
- class catalyst.contrib.datasets.cv.imagewang.Imagewang320(root: str, train: bool = True, download: bool = False, **kwargs)[source]¶
Bases:
catalyst.contrib.datasets.cv.misc.ImageClassificationDataset
Imagewang Dataset with images resized so that the shortest size is 320 px.
- __init__(root: str, train: bool = True, download: bool = False, **kwargs)¶
Constructor method for the ImageClassificationDataset class.
- Parameters
root – root directory of dataset
train – if True, creates dataset from the train/ subfolder, otherwise from val/
download – if True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again
**kwargs – keyword arguments passed to the super().__init__ method.
Imagewoof¶
- catalyst.contrib.datasets.cv.imagewoof¶
alias of the catalyst.contrib.datasets.cv.imagewoof module
Imagewoof160¶
- class catalyst.contrib.datasets.cv.Imagewoof160(root: str, train: bool = True, download: bool = False, **kwargs)[source]¶
Bases:
catalyst.contrib.datasets.cv.misc.ImageClassificationDataset
Imagewoof Dataset with images resized so that the shortest size is 160 px.
- __init__(root: str, train: bool = True, download: bool = False, **kwargs)¶
Constructor method for the ImageClassificationDataset class.
- Parameters
root – root directory of dataset
train – if True, creates dataset from the train/ subfolder, otherwise from val/
download – if True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again
**kwargs – keyword arguments passed to the super().__init__ method.
Imagewoof320¶
- class catalyst.contrib.datasets.cv.Imagewoof320(root: str, train: bool = True, download: bool = False, **kwargs)[source]¶
Bases:
catalyst.contrib.datasets.cv.misc.ImageClassificationDataset
Imagewoof Dataset with images resized so that the shortest size is 320 px.
- __init__(root: str, train: bool = True, download: bool = False, **kwargs)¶
Constructor method for the ImageClassificationDataset class.
- Parameters
root – root directory of dataset
train – if True, creates dataset from the train/ subfolder, otherwise from val/
download – if True, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again
**kwargs – keyword arguments passed to the super().__init__ method.
NN¶
Extensions for torch.nn
Criterion¶
CircleLoss¶
- class catalyst.contrib.nn.criterion.circle.CircleLoss(margin: float, gamma: float)[source]¶
Bases:
torch.nn.modules.module.Module
CircleLoss from Circle Loss: A Unified Perspective of Pair Similarity Optimization paper.
Adapted from: https://github.com/TinyZeaMays/CircleLoss
Example
>>> import torch
>>> from torch.nn import functional as F
>>> from catalyst.contrib.nn import CircleLoss
>>>
>>> features = F.normalize(torch.rand(256, 64, requires_grad=True))
>>> labels = torch.randint(high=10, size=(256,))
>>> criterion = CircleLoss(margin=0.25, gamma=256)
>>> criterion(features, labels)
DiceLoss¶
- class catalyst.contrib.nn.criterion.dice.DiceLoss(class_dim: int = 1, mode: str = 'macro', weights: Optional[List[float]] = None, eps: float = 1e-07)[source]¶
Bases:
torch.nn.modules.module.Module
The Dice loss.
DiceLoss = 1 - dice score
dice score = 2 * intersection / (intersection + union) = 2 * tp / (2 * tp + fp + fn)
- __init__(class_dim: int = 1, mode: str = 'macro', weights: Optional[List[float]] = None, eps: float = 1e-07)[source]¶
- Parameters
class_dim – indicates the class dimension (K) for outputs and targets tensors (default = 1)
mode – class summation strategy. Must be one of ['micro', 'macro', 'weighted']. If mode='micro', classes are ignored and the metric is calculated globally. If mode='macro', the metric is calculated per class and then averaged over all classes. If mode='weighted', the metric is calculated per class and then summed over all classes with weights.
weights – class weights (for mode="weighted")
eps – epsilon to avoid zero division
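Example (a minimal usage sketch, not from the library docs; it assumes outputs are per-class probabilities of shape (batch, num_classes, H, W) and targets are binary masks of the same shape):

import torch
from catalyst.contrib.nn.criterion.dice import DiceLoss

criterion = DiceLoss(class_dim=1, mode="macro")
outputs = torch.rand(4, 3, 32, 32, requires_grad=True)  # per-class probabilities
targets = (torch.rand(4, 3, 32, 32) > 0.5).float()      # binary masks
loss = criterion(outputs, targets)
loss.backward()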
FocalLossBinary¶
- class catalyst.contrib.nn.criterion.focal.FocalLossBinary(ignore: Optional[int] = None, reduced: bool = False, gamma: float = 2.0, alpha: float = 0.25, threshold: float = 0.5, reduction: str = 'mean')[source]¶
Bases:
torch.nn.modules.loss._Loss
Compute focal loss for binary classification problem.
It has been proposed in Focal Loss for Dense Object Detection paper.
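Example (a minimal usage sketch, not from the library docs; the call signature criterion(logits, targets) with binary targets and the tensor shapes are assumptions):

import torch
from catalyst.contrib.nn.criterion.focal import FocalLossBinary

criterion = FocalLossBinary(gamma=2.0, alpha=0.25)
logits = torch.randn(16, requires_grad=True)
targets = torch.randint(0, 2, size=(16,)).float()
loss = criterion(logits, targets)
loss.backward()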
FocalLossMultiClass¶
- class catalyst.contrib.nn.criterion.focal.FocalLossMultiClass(ignore: Optional[int] = None, reduced: bool = False, gamma: float = 2.0, alpha: float = 0.25, threshold: float = 0.5, reduction: str = 'mean')[source]¶
Bases:
catalyst.contrib.nn.criterion.focal.FocalLossBinary
Compute focal loss for multiclass problem. Ignores targets having -1 label.
It has been proposed in Focal Loss for Dense Object Detection paper.
- __init__(ignore: Optional[int] = None, reduced: bool = False, gamma: float = 2.0, alpha: float = 0.25, threshold: float = 0.5, reduction: str = 'mean')¶
@TODO: Docs. Contribution is welcome.
IoULoss¶
- class catalyst.contrib.nn.criterion.iou.IoULoss(class_dim: int = 1, mode: str = 'macro', weights: Optional[List[float]] = None, eps: float = 1e-07)[source]¶
Bases:
torch.nn.modules.module.Module
The intersection over union (Jaccard) loss.
IOULoss = 1 - iou score
iou score = intersection / union = tp / (tp + fp + fn)
- __init__(class_dim: int = 1, mode: str = 'macro', weights: Optional[List[float]] = None, eps: float = 1e-07)[source]¶
- Parameters
class_dim – indicates the class dimension (K) for outputs and targets tensors (default = 1)
mode – class summation strategy. Must be one of ['micro', 'macro', 'weighted']. If mode='micro', classes are ignored and the metric is calculated globally. If mode='macro', the metric is calculated per class and then averaged over all classes. If mode='weighted', the metric is calculated per class and then summed over all classes with weights.
weights – class weights (for mode="weighted")
eps – epsilon to avoid zero division
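Example (a minimal usage sketch mirroring the DiceLoss example above; shapes are illustrative assumptions):

import torch
from catalyst.contrib.nn.criterion.iou import IoULoss

criterion = IoULoss(class_dim=1, mode="macro")
outputs = torch.rand(4, 3, 32, 32, requires_grad=True)  # per-class probabilities
targets = (torch.rand(4, 3, 32, 32) > 0.5).float()      # binary masks
loss = criterion(outputs, targets)
loss.backward()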
MarginLoss¶
NTXentLoss¶
- class catalyst.contrib.nn.criterion.ntxent.NTXentLoss(tau: float, reduction: str = 'mean')[source]¶
Bases:
torch.nn.modules.module.Module
A Contrastive embedding loss.
It has been proposed in A Simple Framework for Contrastive Learning of Visual Representations.
Example:
import torch
from torch.nn import functional as F
from catalyst.contrib.nn import NTXentLoss

embeddings_left = F.normalize(torch.rand(256, 64, requires_grad=True))
embeddings_right = F.normalize(torch.rand(256, 64, requires_grad=True))
criterion = NTXentLoss(tau=0.1)
criterion(embeddings_left, embeddings_right)
- __init__(tau: float, reduction: str = 'mean') → None [source]¶
- Parameters
tau – temperature
reduction (string, optional) – specifies the reduction to apply to the output: "none" | "mean" | "sum". "none": no reduction will be applied, "mean": the sum of the output will be divided by the number of positive pairs in the output, "sum": the output will be summed.
- Raises
ValueError – if reduction is not mean, sum or none
SupervisedContrastiveLoss¶
- class catalyst.contrib.nn.criterion.supervised_contrastive.SupervisedContrastiveLoss(tau: float, reduction: str = 'mean', pos_aggregation='in')[source]¶
Bases:
torch.nn.modules.module.Module
A Contrastive embedding loss that uses targets.
It has been proposed in Supervised Contrastive Learning.
- __init__(tau: float, reduction: str = 'mean', pos_aggregation='in') → None [source]¶
- Parameters
tau – temperature
reduction (string, optional) – specifies the reduction to apply to the output: "none" | "mean" | "sum". "none": no reduction will be applied, "mean": the sum of the output will be divided by the number of positive pairs in the output, "sum": the output will be summed.
pos_aggregation (string, optional) – specifies the place of positive pairs aggregation: "in" | "out". "in": maximization of log(average positive exponentiated similarity), "out": maximization of average positive similarity.
- Raises
ValueError – if reduction is not mean, sum or none
ValueError – if positive aggregation is not in or out
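Example (a minimal usage sketch mirroring the NTXentLoss example above but with class labels; the call signature criterion(features, targets) is an assumption):

import torch
from torch.nn import functional as F
from catalyst.contrib.nn.criterion.supervised_contrastive import SupervisedContrastiveLoss

features = F.normalize(torch.rand(256, 64, requires_grad=True))
targets = torch.randint(high=10, size=(256,))
criterion = SupervisedContrastiveLoss(tau=0.1, reduction="mean", pos_aggregation="in")
loss = criterion(features, targets)
loss.backward()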
TrevskyLoss¶
- class catalyst.contrib.nn.criterion.trevsky.TrevskyLoss(alpha: float, beta: Optional[float] = None, class_dim: int = 1, mode: str = 'macro', weights: Optional[List[float]] = None, eps: float = 1e-07)[source]¶
Bases:
torch.nn.modules.module.Module
The Tversky loss.
TrevskyIndex = TP / (TP + alpha * FN + beta * FP)
TrevskyLoss = 1 - TrevskyIndex
- __init__(alpha: float, beta: Optional[float] = None, class_dim: int = 1, mode: str = 'macro', weights: Optional[List[float]] = None, eps: float = 1e-07)[source]¶
- Parameters
alpha – false negative coefficient; the bigger alpha, the bigger the penalty for false negatives. Must be in (0, 1)
beta – false positive coefficient; the bigger beta, the bigger the penalty for false positives. Must be in (0, 1); if None, beta = (1 - alpha)
class_dim – indicates the class dimension (K) for outputs and targets tensors (default = 1)
mode – class summation strategy. Must be one of ['micro', 'macro', 'weighted']. If mode='micro', classes are ignored and the metric is calculated globally. If mode='macro', the metric is calculated per class and then averaged over all classes. If mode='weighted', the metric is calculated per class and then summed over all classes with weights.
weights – class weights (for mode="weighted")
eps – epsilon to avoid zero division
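Example (a minimal usage sketch following the same segmentation-batch assumptions as the DiceLoss example above):

import torch
from catalyst.contrib.nn.criterion.trevsky import TrevskyLoss

criterion = TrevskyLoss(alpha=0.7, class_dim=1, mode="macro")  # beta defaults to 1 - alpha
outputs = torch.rand(4, 3, 32, 32, requires_grad=True)  # per-class probabilities
targets = (torch.rand(4, 3, 32, 32) > 0.5).float()      # binary masks
loss = criterion(outputs, targets)
loss.backward()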
FocalTrevskyLoss¶
- class catalyst.contrib.nn.criterion.trevsky.FocalTrevskyLoss(alpha: float, beta: Optional[float] = None, gamma: float = 1.3333333333333333, class_dim: int = 1, mode: str = 'macro', weights: Optional[List[float]] = None, eps: float = 1e-07)[source]¶
Bases:
torch.nn.modules.module.Module
The focal Tversky loss.
TrevskyIndex = TP / (TP + alpha * FN + beta * FP)
FocalTrevskyLoss = (1 - TrevskyIndex)^gamma
Note: the focal term is applied per image, so the loss pays more attention to complicated images.
- __init__(alpha: float, beta: Optional[float] = None, gamma: float = 1.3333333333333333, class_dim: int = 1, mode: str = 'macro', weights: Optional[List[float]] = None, eps: float = 1e-07)[source]¶
- Parameters
alpha – false negative coefficient; the bigger alpha, the bigger the penalty for false negatives. Must be in (0, 1)
beta – false positive coefficient; the bigger beta, the bigger the penalty for false positives. Must be in (0, 1); if None, beta = (1 - alpha)
gamma – focal coefficient. It determines how much the weight of simple examples is reduced.
class_dim – indicates the class dimension (K) for outputs and targets tensors (default = 1)
mode – class summation strategy. Must be one of ['micro', 'macro', 'weighted']. If mode='micro', classes are ignored and the metric is calculated globally. If mode='macro', the metric is calculated per class and then averaged over all classes. If mode='weighted', the metric is calculated per class and then summed over all classes with weights.
weights – class weights (for mode="weighted")
eps – epsilon to avoid zero division
TripletMarginLossWithSampler¶
WingLoss¶
- class catalyst.contrib.nn.criterion.wing.WingLoss(width: int = 5, curvature: float = 0.5, reduction: str = 'mean')[source]¶
Bases:
torch.nn.modules.module.Module
Creates a criterion that optimizes a Wing loss.
It has been proposed in Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks.
Adapted from: https://github.com/BloodAxe/pytorch-toolbelt
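Example (a minimal usage sketch, not from the library docs; the call signature criterion(outputs, targets) on landmark-regression tensors of matching shape is an assumption):

import torch
from catalyst.contrib.nn.criterion.wing import WingLoss

criterion = WingLoss(width=5, curvature=0.5, reduction="mean")
outputs = torch.randn(8, 10, requires_grad=True)  # e.g. 5 predicted (x, y) landmarks per sample
targets = torch.randn(8, 10)
loss = criterion(outputs, targets)
loss.backward()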
Contrastive¶
BarlowTwinsLoss¶
- class catalyst.contrib.nn.criterion.contrastive.BarlowTwinsLoss(offdiag_lambda=1.0, eps=1e-12)[source]¶
Bases:
torch.nn.modules.module.Module
The Contrastive embedding loss.
It has been proposed in Barlow Twins: Self-Supervised Learning via Redundancy Reduction.
Example:
import torch
from torch.nn import functional as F
from catalyst.contrib.nn import BarlowTwinsLoss

embeddings_left = F.normalize(torch.rand(256, 64, requires_grad=True))
embeddings_right = F.normalize(torch.rand(256, 64, requires_grad=True))
criterion = BarlowTwinsLoss(offdiag_lambda=1)
criterion(embeddings_left, embeddings_right)
ContrastiveDistanceLoss¶
ContrastiveEmbeddingLoss¶
- class catalyst.contrib.nn.criterion.contrastive.ContrastiveEmbeddingLoss(margin=1.0, reduction='mean')[source]¶
Bases:
torch.nn.modules.module.Module
The Contrastive embedding loss.
It has been proposed in Dimensionality Reduction by Learning an Invariant Mapping.
ContrastivePairwiseEmbeddingLoss¶
RecSys¶
AdaptiveHingeLoss¶
- class catalyst.contrib.nn.criterion.recsys.AdaptiveHingeLoss[source]¶
Bases:
catalyst.contrib.nn.criterion.recsys.PairwiseLoss
Adaptive hinge loss function.
Takes a set of predictions for implicitly negative items and selects those that are highest, thus sampling those negatives that are closest to violating the ranking implicit in the pattern of user interactions.
Example:
import torch
from catalyst.contrib.nn.criterion import recsys

pos_score = torch.randn(3, requires_grad=True)
neg_scores = torch.randn(5, 3, requires_grad=True)

output = recsys.AdaptiveHingeLoss()(pos_score, neg_scores)
output.backward()
- forward(positive_score: torch.Tensor, negative_scores: torch.Tensor) → torch.Tensor [source]¶
Forward propagation method for the adaptive hinge loss.
- Parameters
positive_score – Tensor containing predictions for known positive items.
negative_scores – Iterable of tensors containing predictions for sampled negative items. More tensors increase the likelihood of finding ranking-violating pairs, but risk overfitting.
- Returns
computed loss
- training: bool¶
BPRLoss¶
- class catalyst.contrib.nn.criterion.recsys.BPRLoss(gamma=1e-10)[source]¶
Bases:
catalyst.contrib.nn.criterion.recsys.PairwiseLoss
Bayesian Personalised Ranking loss function.
It has been proposed in BPRLoss: Bayesian Personalized Ranking from Implicit Feedback.
- Parameters
gamma (float) – Small value to avoid division by zero. Default: 1e-10.
Example:
import torch
from catalyst.contrib.nn.criterion import recsys

pos_score = torch.randn(3, requires_grad=True)
neg_score = torch.randn(3, requires_grad=True)

output = recsys.BPRLoss()(pos_score, neg_score)
output.backward()
- forward(positive_score: torch.Tensor, negative_score: torch.Tensor) → torch.Tensor [source]¶
Forward propagation method for the BPR loss.
- Parameters
positive_score – Tensor containing predictions for known positive items.
negative_score – Tensor containing predictions for sampled negative items.
- Returns
computed loss
- training: bool¶
HingeLoss¶
- class catalyst.contrib.nn.criterion.recsys.HingeLoss[source]¶
Bases:
catalyst.contrib.nn.criterion.recsys.PairwiseLoss
Hinge loss function.
Example:
import torch
from catalyst.contrib.nn.criterion import recsys

pos_score = torch.randn(3, requires_grad=True)
neg_score = torch.randn(3, requires_grad=True)

output = recsys.HingeLoss()(pos_score, neg_score)
output.backward()
- forward(positive_score: torch.Tensor, negative_score: torch.Tensor) → torch.Tensor [source]¶
Forward propagation method for the hinge loss.
- Parameters
positive_score – Tensor containing predictions for known positive items.
negative_score – Tensor containing predictions for sampled negative items.
- Returns
computed loss
- training: bool¶
LogisticLoss¶
- class catalyst.contrib.nn.criterion.recsys.LogisticLoss[source]¶
Bases:
catalyst.contrib.nn.criterion.recsys.PairwiseLoss
Logistic loss function.
Example:
import torch
from catalyst.contrib.nn.criterion import recsys

pos_score = torch.randn(3, requires_grad=True)
neg_score = torch.randn(3, requires_grad=True)

output = recsys.LogisticLoss()(pos_score, neg_score)
output.backward()
- forward(positive_score: torch.Tensor, negative_score: torch.Tensor) → torch.Tensor [source]¶
Forward propagation method for the logistic loss.
- Parameters
positive_score – Tensor containing predictions for known positive items.
negative_score – Tensor containing predictions for sampled negative items.
- Returns
computed loss
- training: bool¶
RocStarLoss¶
- class catalyst.contrib.nn.criterion.recsys.RocStarLoss(delta: float = 1.0, sample_size: int = 100, sample_size_gamma: int = 1000, update_gamma_each: int = 50)[source]¶
Bases:
catalyst.contrib.nn.criterion.recsys.PairwiseLoss
Roc-star loss function.
Smooth approximation for ROC-AUC. It has been proposed in Roc-star: An objective function for ROC-AUC that actually works.
Adapted from: https://github.com/iridiumblue/roc-star/issues/2
- Parameters
delta – Param from the article. Default: 1.0.
sample_size – Number of examples to take for ROC AUC approximation. Default: 100.
sample_size_gamma – Number of examples to take for Gamma parameter approximation. Default: 1000.
update_gamma_each – Number of steps after which to recompute gamma value. Default: 50.
Example
import torch
from catalyst.contrib.nn.criterion import recsys

outputs = torch.randn(5, 1, requires_grad=True)
targets = torch.randn(5, 1, requires_grad=True)

output = recsys.RocStarLoss()(outputs, targets)
output.backward()
- forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor [source]¶
Forward propagation method for the roc-star loss.
- Parameters
outputs – Tensor of model predictions in [0, 1] range. Shape (B x 1).
targets – Tensor of true labels in {0, 1}. Shape (B x 1).
- Returns
computed loss
- training: bool¶
WARPLoss¶
- class catalyst.contrib.nn.criterion.recsys.WARPLoss(max_num_trials: Optional[int] = None)[source]¶
Bases:
catalyst.contrib.nn.criterion.recsys.ListwiseLoss
Weighted Approximate-Rank Pairwise (WARP) loss function.
It has been proposed in WSABIE: Scaling Up To Large Vocabulary Image Annotation paper.
WARP loss randomly samples output labels of a model until it finds a pair which it knows is wrongly labelled, and will then only apply an update to these two incorrectly labelled examples.
Adapted from: https://github.com/gabrieltseng/datascience-projects/blob/master/misc/warp.py
- Parameters
max_num_trials – Number of attempts allowed to find a violating negative example. In practice it means that we optimize for ranks 1 to max_num_trials-1.
Example:
import torch
from catalyst.contrib.nn.criterion import recsys

outputs = torch.randn(5, 3, requires_grad=True)
targets = torch.randn(5, 3, requires_grad=True)

output = recsys.WARPLoss()(outputs, targets)
output.backward()
- forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor [source]¶
Forward propagation method for the WARP loss.
- Parameters
outputs – Iterable of tensors containing predictions for all items.
targets – Iterable of tensors containing true labels for all items.
- Returns
computed loss
- training: bool¶
Regression¶
HuberLossV0¶
- class catalyst.contrib.nn.criterion.regression.HuberLossV0(clip_delta=1.0, reduction='mean')[source]¶
Bases:
torch.nn.modules.module.Module
@TODO: Docs. Contribution is welcome.
- forward(output: torch.Tensor, target: torch.Tensor, weights=None) → torch.Tensor [source]¶
@TODO: Docs. Contribution is welcome.
- training: bool¶
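Example (a minimal usage sketch, not from the library docs; tensor shapes are illustrative assumptions):

import torch
from catalyst.contrib.nn.criterion.regression import HuberLossV0

criterion = HuberLossV0(clip_delta=1.0, reduction="mean")
output = torch.randn(16, requires_grad=True)
target = torch.randn(16)
loss = criterion(output, target)
loss.backward()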
CategoricalRegressionLoss¶
- class catalyst.contrib.nn.criterion.regression.CategoricalRegressionLoss(num_atoms: int, v_min: int, v_max: int)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(logits_t: torch.Tensor, logits_tp1: torch.Tensor, atoms_target_t: torch.Tensor) → torch.Tensor [source]¶
Compute the loss
- Parameters
logits_t (torch.Tensor) – predicted atoms at step T, shape: [bs; num_atoms]
logits_tp1 (torch.Tensor) – predicted atoms at step T+1, shape: [bs; num_atoms]
atoms_target_t (torch.Tensor) – target atoms at step T, shape: [bs; num_atoms]
- Returns
computed loss
- Return type
torch.Tensor
- training: bool¶
QuantileRegressionLoss¶
- class catalyst.contrib.nn.criterion.regression.QuantileRegressionLoss(num_atoms: int = 51, clip_delta: float = 1.0)[source]¶
Bases:
torch.nn.modules.module.Module
- forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor [source]¶
Compute the loss.
- Parameters
outputs (torch.Tensor) – predicted atoms, shape: [bs; num_atoms]
targets (torch.Tensor) – target atoms, shape: [bs; num_atoms]
- Returns
computed loss
- Return type
torch.Tensor
- training: bool¶
RSquareLoss¶
- class catalyst.contrib.nn.criterion.regression.RSquareLoss[source]¶
Bases:
torch.nn.modules.module.Module
- forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor [source]¶
Compute the loss.
- Parameters
outputs (torch.Tensor) – model outputs
targets (torch.Tensor) – targets
- Returns
computed loss
- Return type
torch.Tensor
- training: bool¶
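Example (a minimal usage sketch for the R-squared loss, not from the library docs; shapes are illustrative assumptions):

import torch
from catalyst.contrib.nn.criterion.regression import RSquareLoss

criterion = RSquareLoss()
outputs = torch.randn(16, requires_grad=True)
targets = torch.randn(16)
loss = criterion(outputs, targets)
loss.backward()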
Modules¶
ArcFace and SubCenterArcFace¶
- class catalyst.contrib.nn.modules.arcface.ArcFace(in_features: int, out_features: int, s: float = 64.0, m: float = 0.5, eps: float = 1e-06)[source]¶
Bases:
torch.nn.modules.module.Module
Implementation of ArcFace: Additive Angular Margin Loss for Deep Face Recognition.
- Parameters
in_features – size of each input sample.
out_features – size of each output sample.
s – norm of input feature. Default: 64.0.
m – margin. Default: 0.5.
eps – operation accuracy. Default: 1e-6.
- Shape:
Input: \((batch, H_{in})\) where \(H_{in} = in\_features\).
Output: \((batch, H_{out})\) where \(H_{out} = out\_features\).
Example
>>> layer = ArcFace(5, 10, s=1.31, m=0.5)
>>> loss_fn = nn.CrossEntropyLoss()
>>> embedding = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(10)
>>> output = layer(embedding, target)
>>> loss = loss_fn(output, target)
>>> loss.backward()
- forward(input: torch.Tensor, target: Optional[torch.LongTensor] = None) → torch.Tensor [source]¶
- Parameters
input – input features, expected shapes BxF where B is batch dimension and F is an input feature dimension.
target – target classes, expected shapes B where B is batch dimension. If None, the projection on centroids is returned. Default is None.
- Returns
tensor (logits) with shapes BxC where C is a number of classes (out_features).
- training: bool¶
- class catalyst.contrib.nn.modules.arcface.SubCenterArcFace(in_features: int, out_features: int, s: float = 64.0, m: float = 0.5, k: int = 3, eps: float = 1e-06)[source]¶
Bases:
torch.nn.modules.module.Module
Implementation of Sub-center ArcFace: Boosting Face Recognition by Large-scale Noisy Web Faces.
- Parameters
in_features – size of each input sample.
out_features – size of each output sample.
s – norm of input feature. Default: 64.0.
m – margin. Default: 0.5.
k – number of possible class centroids. Default: 3.
eps (float, optional) – operation accuracy. Default: 1e-6.
- Shape:
Input: \((batch, H_{in})\) where \(H_{in} = in\_features\).
Output: \((batch, H_{out})\) where \(H_{out} = out\_features\).
Example
>>> layer = SubCenterArcFace(5, 10, s=1.31, m=0.35, k=2)
>>> loss_fn = nn.CrossEntropyLoss()
>>> embedding = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(10)
>>> output = layer(embedding, target)
>>> loss = loss_fn(output, target)
>>> loss.backward()
- forward(input: torch.Tensor, target: Optional[torch.LongTensor] = None) → torch.Tensor [source]¶
- Parameters
input – input features, expected shapes BxF where B is batch dimension and F is an input feature dimension.
target – target classes, expected shapes B where B is batch dimension. If None, the projection on centroids is returned. Default is None.
- Returns
tensor (logits) with shapes BxC where C is a number of classes.
- training: bool¶
Arc Margin Product¶
- class catalyst.contrib.nn.modules.arcmargin.ArcMarginProduct(in_features: int, out_features: int)[source]¶
Bases:
torch.nn.modules.module.Module
Implementation of Arc Margin Product.
- Parameters
in_features – size of each input sample.
out_features – size of each output sample.
- Shape:
Input: \((batch, H_{in})\) where \(H_{in} = in\_features\).
Output: \((batch, H_{out})\) where \(H_{out} = out\_features\).
Example
>>> layer = ArcMarginProduct(5, 10)
>>> loss_fn = nn.CrossEntropyLoss()
>>> embedding = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(10)
>>> output = layer(embedding)
>>> loss = loss_fn(output, target)
>>> loss.backward()
- forward(input: torch.Tensor) → torch.Tensor [source]¶
- Parameters
input – input features, expected shapes BxF where B is batch dimension and F is an input feature dimension.
- Returns
tensor (logits) with shapes BxC where C is a number of classes (out_features).
- training: bool¶
CosFace and AdaCos¶
- class catalyst.contrib.nn.modules.cosface.AdaCos(in_features: int, out_features: int, dynamical_s: bool = True, eps: float = 1e-06)[source]¶
Bases:
torch.nn.modules.module.Module
Implementation of AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations.
- Parameters
in_features – size of each input sample.
out_features – size of each output sample.
dynamical_s – option to use a dynamical scale parameter. If False, the initial scale is used. Default: True.
eps – operation accuracy. Default: 1e-6.
- Shape:
Input: \((batch, H_{in})\) where \(H_{in} = in\_features\).
Output: \((batch, H_{out})\) where \(H_{out} = out\_features\).
Example
>>> layer = AdaCos(5, 10)
>>> loss_fn = nn.CrossEntropyLoss()
>>> embedding = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(10)
>>> output = layer(embedding, target)
>>> loss = loss_fn(output, target)
>>> loss.backward()
- forward(input: torch.Tensor, target: Optional[torch.LongTensor] = None) → torch.Tensor [source]¶
- Parameters
input – input features, expected shapes BxF where B is batch dimension and F is an input feature dimension.
target – target classes, expected shapes B where B is batch dimension. If None, the projection on centroids is returned. Default is None.
- Returns
tensor (logits) with shapes BxC where C is a number of classes (out_features).
- training: bool¶
- class catalyst.contrib.nn.modules.cosface.CosFace(in_features: int, out_features: int, s: float = 64.0, m: float = 0.35)[source]¶
Bases:
torch.nn.modules.module.Module
Implementation of CosFace: Large Margin Cosine Loss for Deep Face Recognition.
- Parameters
in_features – size of each input sample.
out_features – size of each output sample.
s – norm of input feature. Default: 64.0.
m – margin. Default: 0.35.
- Shape:
Input: \((batch, H_{in})\) where \(H_{in} = in\_features\).
Output: \((batch, H_{out})\) where \(H_{out} = out\_features\).
Example
>>> layer = CosFace(5, 10, s=1.31, m=0.1)
>>> loss_fn = nn.CrossEntropyLoss()
>>> embedding = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(10)
>>> output = layer(embedding, target)
>>> loss = loss_fn(output, target)
>>> loss.backward()
- forward(input: torch.Tensor, target: Optional[torch.LongTensor] = None) → torch.Tensor [source]¶
- Parameters
input – input features, expected shapes BxF where B is batch dimension and F is an input feature dimension.
target – target classes, expected shapes B where B is batch dimension. If None, the projection on centroids is returned. Default is None.
- Returns
tensor (logits) with shapes BxC where C is a number of classes (out_features).
- training: bool¶
CurricularFace¶
- class catalyst.contrib.nn.modules.curricularface.CurricularFace(in_features: int, out_features: int, s: float = 64.0, m: float = 0.5)[source]¶
Bases:
torch.nn.modules.module.Module
Implementation of CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition.
Official pytorch implementation.
- Parameters
in_features – size of each input sample.
out_features – size of each output sample.
s – norm of input feature. Default: 64.0.
m – margin. Default: 0.5.
- Shape:
Input: \((batch, H_{in})\) where \(H_{in} = in\_features\).
Output: \((batch, H_{out})\) where \(H_{out} = out\_features\).
Example
>>> layer = CurricularFace(5, 10, s=1.31, m=0.5)
>>> loss_fn = nn.CrossEntropyLoss()
>>> embedding = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(10)
>>> output = layer(embedding, target)
>>> loss = loss_fn(output, target)
>>> loss.backward()
- forward(input: torch.Tensor, label: Optional[torch.LongTensor] = None) → torch.Tensor [source]¶
- Parameters
input – input features, expected shapes BxF where B is batch dimension and F is an input feature dimension.
label – target classes, expected shapes B where B is batch dimension. If None, the projection on centroids is returned. Default is None.
- Returns
tensor (logits) with shapes BxC where C is a number of classes.
- training: bool¶
sSE¶
- class catalyst.contrib.nn.modules.se.sSE(in_channels: int)[source]¶
Bases:
torch.nn.modules.module.Module
The sSE (Channel Squeeze and Spatial Excitation) block from the Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks paper.
Adapted from https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/66178
Shape:
Input: (batch, channels, height, width)
Output: (batch, channels, height, width) (same shape as input)
cSE¶
- class catalyst.contrib.nn.modules.se.cSE(in_channels: int, r: int = 16)[source]¶
Bases:
torch.nn.modules.module.Module
The channel-wise SE (Squeeze and Excitation) block from the Squeeze-and-Excitation Networks paper.
Adapted from https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/65939 and https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/66178
Shape:
Input: (batch, channels, height, width)
Output: (batch, channels, height, width) (same shape as input)
scSE¶
- class catalyst.contrib.nn.modules.se.scSE(in_channels: int, r: int = 16)[source]¶
Bases:
torch.nn.modules.module.Module
The scSE (Concurrent Spatial and Channel Squeeze and Channel Excitation) block from the Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks paper.
Adapted from https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/66178
Shape:
Input: (batch, channels, height, width)
Output: (batch, channels, height, width) (same shape as input)
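Example (a minimal usage sketch for the SE blocks above; the same call pattern applies to sSE and cSE, and the tensor sizes are illustrative):

import torch
from catalyst.contrib.nn.modules.se import scSE

block = scSE(in_channels=64, r=16)
x = torch.rand(4, 64, 32, 32)  # (batch, channels, height, width)
y = block(x)                   # output has the same shape as the input
assert y.shape == x.shape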
SoftMax¶
- class catalyst.contrib.nn.modules.softmax.SoftMax(in_features: int, num_classes: int)[source]¶
Bases:
torch.nn.modules.module.Module
Implementation of Significance of Softmax-based Features in Comparison to Distance Metric Learning-based Features.
- Parameters
in_features – size of each input sample.
num_classes – number of classes (size of each output sample).
- Shape:
Input: \((batch, H_{in})\) where \(H_{in} = in\_features\).
Output: \((batch, H_{out})\) where \(H_{out} = out\_features\).
Example
>>> layer = SoftMax(5, 10)
>>> loss_fn = nn.CrossEntropyLoss()
>>> embedding = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(10)
>>> output = layer(embedding)
>>> loss = loss_fn(output, target)
>>> loss.backward()
- forward(input: torch.Tensor) → torch.Tensor [source]¶
- Parameters
input – input features, expected shapes BxF where B is batch dimension and F is an input feature dimension.
- Returns
tensor (logits) with shapes BxC where C is a number of classes (num_classes).
- training: bool¶
Optimizers¶
AdamP¶
- class catalyst.contrib.nn.optimizers.adamp.AdamP(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, delta=0.1, wd_ratio=0.1, nesterov=False)[source]¶
Bases:
torch.optim.optimizer.Optimizer
Implements AdamP algorithm.
The original Adam algorithm was proposed in Adam: A Method for Stochastic Optimization. The AdamP variant was proposed in Slowing Down the Weight Norm Increase in Momentum-based Optimizers.
- Parameters
params – iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – learning rate (default: 1e-3)
betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional) – weight decay coefficient (default: 0)
delta – threshold that determines whether a set of parameters is scale invariant or not (default: 0.1)
wd_ratio – relative weight decay applied on scale-invariant parameters compared to that applied on scale-variant parameters (default: 0.1)
nesterov (boolean, optional) – enables Nesterov momentum (default: False)
Original source code: https://github.com/clovaai/AdamP
- __init__(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, delta=0.1, wd_ratio=0.1, nesterov=False)[source]¶
- Parameters
params – iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – learning rate (default: 1e-3)
betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional) – weight decay coefficient (default: 0)
delta – threshold that determines whether a set of parameters is scale invariant or not (default: 0.1)
wd_ratio – relative weight decay applied on scale-invariant parameters compared to that applied on scale-variant parameters (default: 0.1)
nesterov (boolean, optional) – enables Nesterov momentum (default: False)
Lamb¶
- class catalyst.contrib.nn.optimizers.lamb.Lamb(params, lr: Optional[float] = 0.001, betas: Optional[Tuple[float, float]] = (0.9, 0.999), eps: Optional[float] = 1e-06, weight_decay: Optional[float] = 0.0, adam: Optional[bool] = False)[source]¶
Bases:
torch.optim.optimizer.Optimizer
Implements Lamb algorithm.
It has been proposed in Training BERT in 76 minutes.
- __init__(params, lr: Optional[float] = 0.001, betas: Optional[Tuple[float, float]] = (0.9, 0.999), eps: Optional[float] = 1e-06, weight_decay: Optional[float] = 0.0, adam: Optional[bool] = False)[source]¶
- Parameters
params – iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – learning rate (default: 1e-3)
betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-6)
weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
adam (bool, optional) – always use trust ratio = 1, which turns this into Adam. Useful for comparison purposes.
- Raises
ValueError – if invalid learning rate, epsilon value or betas.
Lookahead¶
- class catalyst.contrib.nn.optimizers.lookahead.Lookahead(optimizer: torch.optim.optimizer.Optimizer, k: int = 5, alpha: float = 0.5)[source]¶
Bases:
torch.optim.optimizer.Optimizer
Implements Lookahead algorithm.
It has been proposed in Lookahead Optimizer: k steps forward, 1 step back.
Adapted from: https://github.com/alphadl/lookahead.pytorch (MIT License)
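Example (a minimal usage sketch, not from the library docs; the choice of RAdam as the base optimizer and the tiny model are illustrative assumptions):

import torch
from catalyst.contrib.nn.optimizers.lookahead import Lookahead
from catalyst.contrib.nn.optimizers.radam import RAdam

model = torch.nn.Linear(10, 2)
base_optimizer = RAdam(model.parameters(), lr=1e-3)
optimizer = Lookahead(base_optimizer, k=5, alpha=0.5)

optimizer.zero_grad()
loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()  # every k steps the slow weights move toward the fast weights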
QHAdamW¶
- class catalyst.contrib.nn.optimizers.qhadamw.QHAdamW(params, lr=0.001, betas=(0.995, 0.999), nus=(0.7, 1.0), weight_decay=0.0, eps=1e-08)[source]¶
Bases:
torch.optim.optimizer.Optimizer
Implements QHAdam algorithm.
Combines QHAdam algorithm that was proposed in Quasi-hyperbolic momentum and Adam for deep learning with weight decay decoupling from Decoupled Weight Decay Regularization paper.
Example
>>> optimizer = QHAdamW(
...     model.parameters(),
...     lr=3e-4, nus=(0.8, 1.0), betas=(0.99, 0.999))
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()
Adapted from: https://github.com/iprally/qhadamw-pytorch/blob/master/qhadamw.py (MIT License)
- __init__(params, lr=0.001, betas=(0.995, 0.999), nus=(0.7, 1.0), weight_decay=0.0, eps=1e-08)[source]¶
- Parameters
params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – learning rate (\(\alpha\) from the paper) (default: 1e-3)
betas (Tuple[float, float], optional) – coefficients used for computing running averages of the gradient and its square (default: (0.995, 0.999))
nus (Tuple[float, float], optional) – immediate discount factors used to estimate the gradient and its square (default: (0.7, 1.0))
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional) – weight decay (L2 regularization coefficient, times two) (default: 0.0)
- Raises
ValueError – if invalid learning rate, epsilon value, betas or weight_decay value.
RAdam¶
- class catalyst.contrib.nn.optimizers.radam.RAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)[source]¶
Bases:
torch.optim.optimizer.Optimizer
Implements RAdam algorithm.
It has been proposed in On the Variance of the Adaptive Learning Rate and Beyond.
@TODO: Docs (add Example). Contribution is welcome
Adapted from: https://github.com/LiyuanLucasLiu/RAdam (Apache-2.0 License)
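Example (a hedged usage sketch in the spirit of the other optimizer examples here, not from the library docs; the model, data and loss are illustrative placeholders):

import torch
from catalyst.contrib.nn.optimizers.radam import RAdam

model = torch.nn.Linear(10, 2)
optimizer = RAdam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

optimizer.zero_grad()
logits = model(torch.randn(4, 10))
loss = torch.nn.functional.cross_entropy(logits, torch.randint(0, 2, (4,)))
loss.backward()
optimizer.step()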
Ralamb¶
- class catalyst.contrib.nn.optimizers.ralamb.Ralamb(params: Iterable, lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0)[source]¶
Bases:
torch.optim.optimizer.Optimizer
RAdam optimizer with LARS/LAMB tricks.
Adapted from: https://github.com/mgrankin/over9000/blob/master/ralamb.py (Apache-2.0 License)
- __init__(params: Iterable, lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0)[source]¶
- Parameters
params – iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – learning rate (default: 1e-3)
betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
SGDP¶
- class catalyst.contrib.nn.optimizers.sgdp.SGDP(params, lr=<required parameter>, momentum=0, weight_decay=0, dampening=0, nesterov=False, eps=1e-08, delta=0.1, wd_ratio=0.1)[source]¶
Bases:
torch.optim.optimizer.Optimizer
Implements SGDP algorithm.
The SGDP variant was proposed in Slowing Down the Weight Norm Increase in Momentum-based Optimizers.
- Parameters
params – iterable of parameters to optimize or dicts defining parameter groups
lr – learning rate
momentum (float, optional) – momentum factor (default: 0)
weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
dampening (float, optional) – dampening for momentum (default: 0)
nesterov (bool, optional) – enables Nesterov momentum (default: False)
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
delta – threshold that determines whether a set of parameters is scale invariant or not (default: 0.1)
wd_ratio – relative weight decay applied on scale-invariant parameters compared to that applied on scale-variant parameters (default: 0.1)
- __init__(params, lr=<required parameter>, momentum=0, weight_decay=0, dampening=0, nesterov=False, eps=1e-08, delta=0.1, wd_ratio=0.1)[source]¶
- Parameters
params – iterable of parameters to optimize or dicts defining parameter groups
lr – learning rate
momentum (float, optional) – momentum factor (default: 0)
weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
dampening (float, optional) – dampening for momentum (default: 0)
nesterov (bool, optional) – enables Nesterov momentum (default: False)
eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)
delta – threshold that determines whether a set of parameters is scale invariant or not (default: 0.1)
wd_ratio – relative weight decay applied on scale-invariant parameters compared to that applied on scale-variant parameters (default: 0.1)
Schedulers¶
OneCycleLRWithWarmup¶
- class catalyst.contrib.nn.schedulers.onecycle.OneCycleLRWithWarmup(optimizer: torch.optim.optimizer.Optimizer, num_steps: int, lr_range=(1.0, 0.005), init_lr: Optional[float] = None, warmup_steps: int = 0, warmup_fraction: Optional[float] = None, decay_steps: int = 0, decay_fraction: Optional[float] = None, momentum_range=(0.8, 0.99, 0.999), init_momentum: Optional[float] = None)[source]¶
Bases:
catalyst.contrib.nn.schedulers.base.BatchScheduler
OneCycle scheduler with warm-up & lr decay stages.
The first stage, called warmup, increases lr from init_lr to max_lr and decreases momentum from init_momentum to min_momentum. It takes warmup_steps steps.
The second stage is annealing: lr decreases from max_lr to min_lr, and momentum increases from min_momentum to max_momentum.
The third, optional stage is lr decay.
- __init__(optimizer: torch.optim.optimizer.Optimizer, num_steps: int, lr_range=(1.0, 0.005), init_lr: Optional[float] = None, warmup_steps: int = 0, warmup_fraction: Optional[float] = None, decay_steps: int = 0, decay_fraction: Optional[float] = None, momentum_range=(0.8, 0.99, 0.999), init_momentum: Optional[float] = None)[source]¶
- Parameters
optimizer – PyTorch optimizer
num_steps – total number of steps
lr_range – tuple with two or three elements (max_lr, min_lr, [final_lr])
init_lr (float, optional) – initial lr
warmup_steps – count of steps for warm-up stage
warmup_fraction (float, optional) – fraction in [0; 1) to calculate number of warmup steps. Cannot be set together with warmup_steps
decay_steps – count of steps for lr decay stage
decay_fraction (float, optional) – fraction in [0; 1) to calculate number of decay steps. Cannot be set together with decay_steps
momentum_range – tuple with two or three elements (min_momentum, max_momentum, [final_momentum])
init_momentum (float, optional) – initial momentum
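Example (a minimal usage sketch, not from the library docs; the optimizer choice, the step counts, and the use of the standard step() interface once per batch are illustrative assumptions):

import torch
from catalyst.contrib.nn.schedulers.onecycle import OneCycleLRWithWarmup

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = OneCycleLRWithWarmup(
    optimizer,
    num_steps=1000,            # total number of batches
    lr_range=(0.1, 0.001),     # (max_lr, min_lr)
    warmup_steps=100,
    momentum_range=(0.85, 0.95),
)

for _ in range(1000):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the schedule once per batch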
Scripts¶
You can use contrib scripts with catalyst-contrib in your terminal. For example:
$ catalyst-contrib tag2label --help
Catalyst-contrib scripts.
Examples
1. collect-env outputs relevant system environment info. It diagnoses your system and shows basic information, which is useful for more detailed bug reports.
$ catalyst-contrib collect-env
2. process-images reads raw data and outputs preprocessed resized images
$ catalyst-contrib process-images \
    --in-dir /path/to/raw/data/ \
    --out-dir=./data/dataset \
    --num-workers=6 \
    --max-size=224 \
    --extension=png \
    --clear-exif \
    --grayscale \
    --expand-dims
3. tag2label prepares a dataset into a json labeling like {"class_id": class_column_from_dataset}
$ catalyst-contrib tag2label \
    --in-dir=./data/dataset \
    --out-dataset=./data/dataset_raw.csv \
    --out-labeling=./data/tag2cls.json
4. split-dataframe splits your dataset into train/valid folds
$ catalyst-contrib split-dataframe \
    --in-csv=./data/dataset_raw.csv \
    --tag2class=./data/tag2cls.json \
    --tag-column=tag \
    --class-column=class \
    --n-folds=5 \
    --train-folds=0,1,2,3 \
    --out-csv=./data/dataset.csv