Metrics

Accuracy

catalyst.metrics.accuracy.accuracy(outputs: torch.Tensor, targets: torch.Tensor, topk: Sequence[int] = (1, ), activation: Optional[str] = None) → Sequence[torch.Tensor][source]

Computes multiclass accuracy@topk for the specified values of topk.

Parameters
  • outputs – model outputs, logits with shape [bs; num_classes]

  • targets – ground truth, labels with shape [bs; 1]

  • activation – activation to use for model output

  • topk – topk for accuracy@topk computation

Returns

list with computed accuracy@topk

Example

>>> accuracy(
>>>     outputs=torch.tensor([
>>>         [1, 0, 0],
>>>         [0, 1, 0],
>>>         [0, 0, 1],
>>>     ]),
>>>     targets=torch.tensor([0, 1, 2]),
>>> )
[tensor([1.])]
>>> accuracy(
>>>     outputs=torch.tensor([
>>>         [1, 0, 0],
>>>         [0, 1, 0],
>>>         [0, 1, 0],
>>>     ]),
>>>     targets=torch.tensor([0, 1, 2]),
>>> )
[tensor([0.6667])]
>>> accuracy(
>>>     outputs=torch.tensor([
>>>         [1, 0, 0],
>>>         [0, 1, 0],
>>>         [0, 0, 1],
>>>     ]),
>>>     targets=torch.tensor([0, 1, 2]),
>>>     topk=[1, 3],
>>> )
[tensor([1.]), tensor([1.])]
>>> accuracy(
>>>     outputs=torch.tensor([
>>>         [1, 0, 0],
>>>         [0, 1, 0],
>>>         [0, 1, 0],
>>>     ]),
>>>     targets=torch.tensor([0, 1, 2]),
>>>     topk=[1, 3],
>>> )
[tensor([0.6667]), tensor([1.])]
catalyst.metrics.accuracy.multilabel_accuracy(outputs: torch.Tensor, targets: torch.Tensor, threshold: Union[float, torch.Tensor]) → torch.Tensor[source]

Computes multilabel accuracy for the specified threshold.

Parameters
  • outputs – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.

  • targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (e.g., a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4)

  • threshold – threshold for model output

Returns

computed multilabel accuracy

Example

>>> multilabel_accuracy(
>>>     outputs=torch.tensor([
>>>         [1, 0],
>>>         [0, 1],
>>>     ]),
>>>     targets=torch.tensor([
>>>         [1, 0],
>>>         [0, 1],
>>>     ]),
>>>     threshold=0.5,
>>> )
tensor([1.])
>>> multilabel_accuracy(
>>>     outputs=torch.tensor([
>>>         [1.0, 0.0],
>>>         [0.6, 1.0],
>>>     ]),
>>>     targets=torch.tensor([
>>>         [1, 0],
>>>         [0, 1],
>>>     ]),
>>>     threshold=0.5,
>>> )
tensor(0.7500)
>>> multilabel_accuracy(
>>>     outputs=torch.tensor([
>>>         [1.0, 0.0],
>>>         [0.4, 1.0],
>>>     ]),
>>>     targets=torch.tensor([
>>>         [1, 0],
>>>         [0, 1],
>>>     ]),
>>>     threshold=0.5,
>>> )
tensor(1.0)

AUC

catalyst.metrics.auc.auc(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor[source]

AUC metric.

Parameters
  • outputs – [data_len; num_classes] estimated scores from a model.

  • targets – [data_len; num_classes] ground truth (correct) target values.

Returns

Tensor with [num_classes] shape of per-class-aucs

Return type

torch.Tensor

Example

>>> auc(
>>>     outputs=torch.tensor([
>>>         [0.9, 0.1],
>>>         [0.1, 0.9],
>>>     ]),
>>>     targets=torch.tensor([
>>>         [1, 0],
>>>         [0, 1],
>>>     ]),
>>> )
tensor([1., 1.])
>>> auc(
>>>     outputs=torch.tensor([
>>>         [0.9],
>>>         [0.8],
>>>         [0.7],
>>>         [0.6],
>>>         [0.5],
>>>         [0.4],
>>>         [0.3],
>>>         [0.2],
>>>         [0.1],
>>>         [0.0],
>>>     ]),
>>>     targets=torch.tensor([
>>>         [0],
>>>         [1],
>>>         [1],
>>>         [1],
>>>         [1],
>>>         [1],
>>>         [1],
>>>         [0],
>>>         [0],
>>>         [0],
>>>     ]),
>>> )
tensor([0.7500])

CMC score

catalyst.metrics.cmc_score.cmc_score_count(distances: torch.Tensor, conformity_matrix: torch.Tensor, topk: int = 1) → float[source]

Function to count CMC from distance matrix and conformity matrix.

Parameters
  • distances – distance matrix shape of (n_embeddings_x, n_embeddings_y)

  • conformity_matrix – binary matrix with 1 on same label pos and 0 otherwise

  • topk – number of top examples for cumulative score counting

Returns

cmc score
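
For illustration, a minimal call sketch added here (not part of the original docstring); it assumes a boolean conformity matrix, and since the closest gallery item matches for both queries, the top-1 CMC score is 1.0:

>>> cmc_score_count(
>>>     distances=torch.tensor([
>>>         [0.1, 0.9],
>>>         [0.9, 0.1],
>>>     ]),
>>>     conformity_matrix=torch.tensor([
>>>         [True, False],
>>>         [False, True],
>>>     ]),
>>>     topk=1,
>>> )
1.0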

catalyst.metrics.cmc_score.cmc_score(query_embeddings: torch.Tensor, gallery_embeddings: torch.Tensor, conformity_matrix: torch.Tensor, topk: int = 1) → float[source]

Function to count CMC score from query and gallery embeddings.

Parameters
  • query_embeddings – tensor shape of (n_embeddings, embedding_dim) embeddings of the objects in query

  • gallery_embeddings – tensor shape of (n_embeddings, embedding_dim) embeddings of the objects in gallery

  • conformity_matrix – binary matrix with 1 on same label pos and 0 otherwise

  • topk – number of top examples for cumulative score counting

Returns

cmc score
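
As an illustrative sketch (added, not from the original docs): with identical query and gallery embeddings and a diagonal conformity matrix, every query's nearest gallery item is its own copy, so the top-1 CMC score should be 1.0:

>>> cmc_score(
>>>     query_embeddings=torch.tensor([[1.0, 0.0], [0.0, 1.0]]),
>>>     gallery_embeddings=torch.tensor([[1.0, 0.0], [0.0, 1.0]]),
>>>     conformity_matrix=torch.tensor([[True, False], [False, True]]),
>>>     topk=1,
>>> )
1.0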

Dice

catalyst.metrics.dice.dice(outputs: torch.Tensor, targets: torch.Tensor, class_dim: int = 1, threshold: float = None, eps: float = 1e-07) → torch.Tensor[source]

Computes the dice score.

Parameters
  • outputs – [N; K; …] tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.

  • targets – binary [N; K; …] tensor that encodes which of the K classes are associated with the N-th input

  • class_dim – indicates class dimension (K) for outputs and targets tensors (default = 1)

  • threshold – threshold for outputs binarization

  • eps – epsilon to avoid zero division

Returns

Dice score

Examples

>>> size = 4
>>> half_size = size // 2
>>> shape = (1, 1, size, size)
>>> empty = torch.zeros(shape)
>>> full = torch.ones(shape)
>>> left = torch.ones(shape)
>>> left[:, :, :, half_size:] = 0
>>> right = torch.ones(shape)
>>> right[:, :, :, :half_size] = 0
>>> top_left = torch.zeros(shape)
>>> top_left[:, :, :half_size, :half_size] = 1
>>> pred = torch.cat([empty, left, empty, full, left, top_left], dim=1)
>>> targets = torch.cat([full, right, empty, full, left, left], dim=1)
>>> dice(
>>>     outputs=pred,
>>>     targets=targets,
>>>     class_dim=1,
>>>     threshold=0.5,
>>> )
tensor([0.0000, 0.0000, 1.0000, 1.0000, 1.0000, 0.6667])
catalyst.metrics.dice.calculate_dice(true_positives: numpy.array, false_positives: numpy.array, false_negatives: numpy.array) → numpy.array[source]

Calculate list of Dice coefficients.

Parameters
  • true_positives – true positives numpy tensor

  • false_positives – false positives numpy tensor

  • false_negatives – false negatives numpy tensor

Returns

dice score

Return type

np.array

Raises

ValueError – if dice is out of [0; 1] bounds
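
A small worked sketch (added for illustration, assuming the standard Dice formula 2*TP / (2*TP + FP + FN)): with TP=2, FP=1, FN=1 the coefficient is 4 / 6 ≈ 0.667, and with TP=4, FP=1, FN=0 it is 8 / 9 ≈ 0.889:

>>> import numpy as np
>>> calculate_dice(
>>>     true_positives=np.array([2.0, 4.0]),
>>>     false_positives=np.array([1.0, 1.0]),
>>>     false_negatives=np.array([1.0, 0.0]),
>>> )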

F1 score

catalyst.metrics.f1_score.f1_score(outputs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-07, argmax_dim: int = -1, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]

Fbeta_score with beta=1.

Parameters
  • outputs – A list of predicted elements

  • targets – A list of elements that are to be predicted

  • eps – epsilon to avoid zero division

  • argmax_dim – int that specifies the dimension for the argmax transformation in case of scores/probabilities in outputs

  • num_classes – int that specifies the number of classes if it is known

Returns

F_1 score

Return type

float
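
For instance, a minimal sketch added here (not from the original docstring): with perfect one-hot predictions, precision and recall are both 1, so the score should be close to 1.0 for every class (up to eps); the exact return shape depends on the implementation:

>>> f1_score(
>>>     outputs=torch.tensor([
>>>         [1, 0, 0],
>>>         [0, 1, 0],
>>>         [0, 0, 1],
>>>     ]),
>>>     targets=torch.tensor([0, 1, 2]),
>>> )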

catalyst.metrics.f1_score.fbeta_score(outputs: torch.Tensor, targets: torch.Tensor, beta: float = 1.0, eps: float = 1e-07, argmax_dim: int = -1, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]

Counts fbeta score for given outputs and targets.

Parameters
  • outputs – A list of predicted elements

  • targets – A list of elements that are to be predicted

  • beta – beta param for f_score

  • eps – epsilon to avoid zero division

  • argmax_dim – int that specifies the dimension for the argmax transformation in case of scores/probabilities in outputs

  • num_classes – int that specifies the number of classes if it is known

Raises

Exception – If beta is a negative number.

Returns

F_beta score.

Return type

float
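
Similarly, an illustrative call (added here, result not asserted): beta=2 weights recall more heavily than precision, and with the perfect predictions below the result should again be close to 1.0:

>>> fbeta_score(
>>>     outputs=torch.tensor([
>>>         [1, 0, 0],
>>>         [0, 1, 0],
>>>         [0, 0, 1],
>>>     ]),
>>>     targets=torch.tensor([0, 1, 2]),
>>>     beta=2.0,
>>> )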

Focal

Focal losses:
catalyst.metrics.focal.sigmoid_focal_loss(outputs: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0, alpha: float = 0.25, reduction: str = 'mean')[source]

Compute binary focal loss between target and output logits.

Parameters
  • outputs – tensor of arbitrary shape

  • targets – tensor of the same shape as input

  • gamma – gamma for focal loss

  • alpha – alpha for focal loss

  • reduction (string, optional) – specifies the reduction to apply to the output: "none" | "mean" | "sum" | "batchwise_mean". "none": no reduction will be applied, "mean": the sum of the output will be divided by the number of elements in the output, "sum": the output will be summed, "batchwise_mean": computes mean loss per sample in batch. Default: "mean".

Returns

computed loss

Source: https://github.com/BloodAxe/pytorch-toolbelt
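
A minimal call sketch (added for illustration; the numeric result is not asserted here): logits and binary targets of the same shape, reduced to a scalar loss with the default reduction="mean":

>>> sigmoid_focal_loss(
>>>     outputs=torch.tensor([3.0, -2.0, 0.5]),
>>>     targets=torch.tensor([1.0, 0.0, 1.0]),
>>>     gamma=2.0,
>>>     alpha=0.25,
>>>     reduction="mean",
>>> )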

catalyst.metrics.focal.reduced_focal_loss(outputs: torch.Tensor, targets: torch.Tensor, threshold: float = 0.5, gamma: float = 2.0, reduction='mean') → torch.Tensor[source]

Compute reduced focal loss between target and output logits.

It has been proposed in the Reduced Focal Loss: 1st Place Solution to xView object detection in Satellite Imagery paper.

Note

size_average and reduce params are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction.

Source: https://github.com/BloodAxe/pytorch-toolbelt

Parameters
  • outputs – tensor of arbitrary shape

  • targets – tensor of the same shape as input

  • threshold – threshold for focal reduction

  • gamma – gamma for focal reduction

  • reduction (string, optional) – specifies the reduction to apply to the output: "none" | "mean" | "sum" | "batchwise_mean". "none": no reduction will be applied, "mean": the sum of the output will be divided by the number of elements in the output, "sum": the output will be summed. "batchwise_mean" computes mean loss per sample in batch. Default: “mean”

Returns

computed loss

Return type

torch.Tensor
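
An illustrative call (added here; the numeric value is not asserted): losses for well-classified samples above the threshold are down-weighted before the mean reduction:

>>> reduced_focal_loss(
>>>     outputs=torch.tensor([3.0, -2.0, 0.5]),
>>>     targets=torch.tensor([1.0, 0.0, 1.0]),
>>>     threshold=0.5,
>>>     gamma=2.0,
>>>     reduction="mean",
>>> )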

Hitrate

Hitrate metric:
catalyst.metrics.hitrate.hitrate(outputs: torch.Tensor, targets: torch.Tensor, topk: List[int]) → List[torch.Tensor][source]

Calculate the hit-rate score given model outputs and targets. Hit-rate is a metric for evaluating ranking systems: generate top-N recommendations for each user and, if one of the recommended items is an item the user has actually rated (where a rating is any explicit form of user interaction), count it as a hit. Add up the hits over all users and divide by the number of users.

Compute the top-N recommendations for each user in the training stage and intentionally remove one of these items from the training data.

Parameters
  • outputs (torch.Tensor) – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits

  • targets (torch.Tensor) – binary tensor with ground truth: 1 means the item is relevant for the user and 0 means it is not; size: [batch_size, slate_length]; ground truth, labels

  • topk (List[int]) – parameter for evaluation on top-k items

Returns

the hit rate score

Return type

hitrate_at_k (List[torch.Tensor])
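
An illustrative call sketch (added, not from the original docs; output values not asserted). With the inputs below, the top-1 item of the first slate (index 0) is not relevant while the top-1 item of the second slate (index 3) is, so only the second user scores a hit at k=1:

>>> hitrate(
>>>     outputs=torch.Tensor([
>>>         [4.0, 2.0, 3.0, 1.0],
>>>         [1.0, 2.0, 3.0, 4.0],
>>>     ]),
>>>     targets=torch.Tensor([
>>>         [0, 0, 1.0, 1.0],
>>>         [0, 0, 1.0, 1.0],
>>>     ]),
>>>     topk=[1, 3],
>>> )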

IoU

catalyst.metrics.iou.iou(outputs: torch.Tensor, targets: torch.Tensor, class_dim: int = 1, threshold: float = None, eps: float = 1e-07) → torch.Tensor[source]

Computes the iou/jaccard score.

Parameters
  • outputs – [N; K; …] tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.

  • targets – binary [N; K; …] tensor that encodes which of the K classes are associated with the N-th input

  • class_dim – indicates class dimension (K) for outputs and targets tensors (default = 1)

  • threshold – threshold for outputs binarization

  • eps – epsilon to avoid zero division

Returns

IoU (Jaccard) score

Examples

>>> size = 4
>>> half_size = size // 2
>>> shape = (1, 1, size, size)
>>> empty = torch.zeros(shape)
>>> full = torch.ones(shape)
>>> left = torch.ones(shape)
>>> left[:, :, :, half_size:] = 0
>>> right = torch.ones(shape)
>>> right[:, :, :, :half_size] = 0
>>> top_left = torch.zeros(shape)
>>> top_left[:, :, :half_size, :half_size] = 1
>>> pred = torch.cat([empty, left, empty, full, left, top_left], dim=1)
>>> targets = torch.cat([full, right, empty, full, left, left], dim=1)
>>> iou(
>>>     outputs=pred,
>>>     targets=targets,
>>>     class_dim=1,
>>>     threshold=0.5,
>>> )
tensor([0.0000, 0.0000, 1.0000, 1.0000, 1.0000, 0.5000])
catalyst.metrics.iou.jaccard(outputs: torch.Tensor, targets: torch.Tensor, class_dim: int = 1, threshold: float = None, eps: float = 1e-07) → torch.Tensor

Computes the iou/jaccard score.

Parameters
  • outputs – [N; K; …] tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.

  • targets – binary [N; K; …] tensor that encodes which of the K classes are associated with the N-th input

  • class_dim – indicates class dimension (K) for outputs and targets tensors (default = 1)

  • threshold – threshold for outputs binarization

  • eps – epsilon to avoid zero division

Returns

IoU (Jaccard) score

Examples

>>> size = 4
>>> half_size = size // 2
>>> shape = (1, 1, size, size)
>>> empty = torch.zeros(shape)
>>> full = torch.ones(shape)
>>> left = torch.ones(shape)
>>> left[:, :, :, half_size:] = 0
>>> right = torch.ones(shape)
>>> right[:, :, :, :half_size] = 0
>>> top_left = torch.zeros(shape)
>>> top_left[:, :, :half_size, :half_size] = 1
>>> pred = torch.cat([empty, left, empty, full, left, top_left], dim=1)
>>> targets = torch.cat([full, right, empty, full, left, left], dim=1)
>>> iou(
>>>     outputs=pred,
>>>     targets=targets,
>>>     class_dim=1,
>>>     threshold=0.5,
>>> )
tensor([0.0000, 0.0000, 1.0000, 1.0000, 1.0000, 0.5000])

MRR

catalyst.metrics.mrr.reciprocal_rank(outputs: torch.Tensor, targets: torch.Tensor, k: int) → torch.Tensor[source]

Calculate the Reciprocal Rank (MRR) score given model outputs and targets. Data is aggregated in batches.

Parameters
  • outputs – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits

  • targets – binary tensor with ground truth: 1 means the item is relevant and 0 means it is not; size: [batch_size, slate_length]; ground truth, labels

  • k – Parameter for evaluation on top-k items

Returns

MRR score

Examples

>>> reciprocal_rank(
>>>     outputs=torch.Tensor([
>>>         [4.0, 2.0, 3.0, 1.0],
>>>         [1.0, 2.0, 3.0, 4.0],
>>>     ]),
>>>     targets=torch.Tensor([
>>>         [0, 0, 1.0, 1.0],
>>>         [0, 0, 1.0, 1.0],
>>>     ]),
>>>     k=1,
>>> )
tensor([[0.], [1.]])
>>> reciprocal_rank(
>>>     outputs=torch.Tensor([
>>>         [4.0, 2.0, 3.0, 1.0],
>>>         [1.0, 2.0, 3.0, 4.0],
>>>     ]),
>>>     targets=torch.Tensor([
>>>         [0, 0, 1.0, 1.0],
>>>         [0, 0, 1.0, 1.0],
>>>     ]),
>>>     k=3,
>>> )
tensor([[0.5000], [1.0000]])
catalyst.metrics.mrr.mrr(outputs: torch.Tensor, targets: torch.Tensor, topk: List[int]) → List[torch.Tensor][source]

Calculate the Mean Reciprocal Rank (MRR) score given model outputs and targets. Data is aggregated in batches.

MRR@k is the mean, over the batch, of the reciprocal rank, i.e. the inverse of the rank of the highest-ranked relevant item if one appears in the top k, and 0 otherwise. https://en.wikipedia.org/wiki/Mean_reciprocal_rank

Parameters
  • outputs – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits

  • targets – binary tensor with ground truth: 1 means the item is relevant and 0 means it is not; size: [batch_size, slate_length]; ground truth, labels

  • topk – parameter for evaluation on top-k items

Returns

MRR score

Examples

>>> mrr(
>>>     outputs=torch.Tensor([
>>>         [4.0, 2.0, 3.0, 1.0],
>>>         [1.0, 2.0, 3.0, 4.0],
>>>     ]),
>>>     targets=torch.Tensor([
>>>         [0, 0, 1.0, 1.0],
>>>         [0, 0, 1.0, 1.0],
>>>     ]),
>>>     topk=[1, 3],
>>> )
[tensor(0.5000), tensor(0.7500)]

MAP

MAP metric.

catalyst.metrics.avg_precision.mean_avg_precision(outputs: torch.Tensor, targets: torch.Tensor, topk: List[int]) → List[torch.Tensor][source]

Calculate the mean average precision (MAP) for RecSys. The metric calculates the mean of the average precision (AP) across all samples in the batch.

MAP amplifies the interest in finding many relevant items for each query.

Parameters
  • outputs (torch.Tensor) – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits

  • targets (torch.Tensor) – binary tensor with ground truth: 1 means the item is relevant and 0 means it is not; size: [batch_size, slate_length]; ground truth, labels

  • topk (List[int]) – list of k values at which the top-k items are evaluated

Returns

The MAP score for every k; size: len(topk)

Return type

map_at_k (Tuple[float])

Examples

>>> mean_avg_precision(
>>>     outputs=torch.tensor([
>>>         [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
>>>         [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
>>>     ]),
>>>     targets=torch.tensor([
>>>         [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0],
>>>         [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
>>>     ]),
>>>     topk=[10],
>>> )
[tensor(0.5325)]
catalyst.metrics.avg_precision.avg_precision(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor[source]

Calculate the Average Precision for RecSys. The precision metric summarizes the fraction of relevant items out of the whole recommendation list.

To compute the precision at k, set the threshold rank to k and compute the percentage of relevant items among the top k, ignoring the documents ranked lower than k.

The average precision at k (AP@k) summarizes the average precision for relevant items up to the k-th one. See the Wikipedia entry for average precision:

<https://en.wikipedia.org/w/index.php?title=Information_retrieval&oldid=793358396#Average_precision>

If a relevant document never gets retrieved, we assume the precision corresponding to that relevant document to be zero.

Parameters
  • outputs (torch.Tensor) – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits

  • targets (torch.Tensor) – binary tensor with ground truth: 1 means the item is relevant and 0 means it is not; size: [batch_size, slate_length]; ground truth, labels

Returns

The AP score for each sample in the batch; size: [batch_size, 1]

Return type

ap_score (torch.Tensor)

Examples

>>> avg_precision(
>>>     outputs=torch.tensor([
>>>         [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
>>>         [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
>>>     ]),
>>>     targets=torch.tensor([
>>>         [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0],
>>>         [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
>>>     ]),
>>> )
tensor([0.6222, 0.4429])

NDCG

catalyst.metrics.ndcg.dcg(outputs: torch.Tensor, targets: torch.Tensor, gain_function='exp_rank') → torch.Tensor[source]

Computes DCG@topk for the specified values of k. Graded relevance as a measure of usefulness, or gain, from examining a set of items. Gain may be reduced at lower ranks. Reference: https://en.wikipedia.org/wiki/Discounted_cumulative_gain

Parameters
  • outputs – model outputs, logits with shape [batch_size; slate_length]

  • targets – ground truth, labels with shape [batch_size; slate_length]

  • gain_function – string that indicates the gain function for the ground truth labels. Two options are available: exp_rank (torch.pow(2, x) - 1) and linear_rank (x). By default, exp_rank is used to emphasize retrieving the relevant documents.

Returns

The discounted gains tensor

Return type

dcg_score (torch.Tensor)

Raises

ValueError – if the gain function is neither exp_rank nor linear_rank

Examples

>>> dcg(
>>>     outputs = torch.tensor([
>>>         [3, 2, 1, 0],
>>>     ]),
>>>     targets = torch.Tensor([
>>>         [2.0, 2.0, 1.0, 0.0],
>>>     ]),
>>>     gain_function="linear_rank",
>>> )
tensor([[2.0000, 2.0000, 0.6309, 0.0000]])
>>> dcg(
>>>     outputs = torch.tensor([
>>>         [3, 2, 1, 0],
>>>     ]),
>>>     targets = torch.Tensor([
>>>         [2.0, 2.0, 1.0, 0.0],
>>>     ]),
>>>     gain_function="linear_rank",
>>> ).sum()
tensor(4.6309)
>>> dcg(
>>>     outputs = torch.tensor([
>>>         [3, 2, 1, 0],
>>>     ]),
>>>     targets = torch.Tensor([
>>>         [2.0, 2.0, 1.0, 0.0],
>>>     ]),
>>>     gain_function="exp_rank",
>>> )
tensor([[3.0000, 1.8928, 0.5000, 0.0000]])
>>> dcg(
>>>     outputs = torch.tensor([
>>>         [3, 2, 1, 0],
>>>     ]),
>>>     targets = torch.Tensor([
>>>         [2.0, 2.0, 1.0, 0.0],
>>>     ]),
>>>     gain_function="exp_rank",
>>> ).sum()
tensor(5.3928)
catalyst.metrics.ndcg.ndcg(outputs: torch.Tensor, targets: torch.Tensor, topk: List[int], gain_function='exp_rank') → List[torch.Tensor][source]

Computes nDCG@topk for the specified values of topk.

Parameters
  • outputs (torch.Tensor) – model outputs, logits with shape [batch_size; slate_size]

  • targets (torch.Tensor) – ground truth, labels with shape [batch_size; slate_size]

  • gain_function – string that indicates the gain function for the ground truth labels. Two options are available: exp_rank (torch.pow(2, x) - 1) and linear_rank (x). By default, exp_rank is used to emphasize retrieving the relevant documents.

  • topk (List[int]) – parameter for evaluation on top-k items

Returns

tuple with computed ndcg@topk

Return type

results (Tuple[float])

Examples

>>> ndcg(
>>>     outputs = torch.tensor([
>>>         [0.5, 0.2, 0.1],
>>>         [0.5, 0.2, 0.1],
>>>     ]),
>>>     targets = torch.Tensor([
>>>         [1.0, 0.0, 1.0],
>>>         [1.0, 0.0, 1.0],
>>>     ]),
>>>     topk=[2],
>>>     gain_function="exp_rank",
>>> )
[tensor(0.6131)]

Recall

catalyst.metrics.recall.recall(outputs: torch.Tensor, targets: torch.Tensor, argmax_dim: int = -1, eps: float = 1e-07, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]

Multiclass recall metric.

Parameters
  • outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)]

  • targets – ground truth (correct) target values with shape [bs; …, 1]

  • argmax_dim – int that specifies the dimension for the argmax transformation in case of scores/probabilities in outputs

  • eps – float. Epsilon to avoid zero division.

  • num_classes – int that specifies the number of classes if it is known.

Returns

recall for every class

Return type

Tensor

Examples

>>> recall(
>>>     outputs=torch.tensor([
>>>         [1, 0, 0],
>>>         [0, 1, 0],
>>>         [0, 0, 1],
>>>     ]),
>>>     targets=torch.tensor([0, 1, 2]),
>>> )
tensor([1., 1., 1.])
>>> recall(
>>>     outputs=torch.tensor([[0, 0, 1, 1, 0, 1, 0, 1]]),
>>>     targets=torch.tensor([[0, 1, 0, 1, 0, 0, 1, 1]]),
>>> )
tensor([0.5000, 0.5000])

Functional

catalyst.metrics.functional.check_consistent_length(*tensors)[source]

Check that all arrays have consistent first dimensions. Checks whether all objects in arrays have the same shape or length.

Parameters

tensors – list of input objects (tensors) that will be checked for consistent length.

Raises

ValueError – “Inconsistent numbers of samples”
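
For example (an added sketch): tensors whose first dimensions agree pass silently, while mismatched first dimensions raise the ValueError described above:

>>> check_consistent_length(torch.zeros(3, 2), torch.zeros(3, 5))  # expected to pass, both have 3 samples
>>> check_consistent_length(torch.zeros(3, 2), torch.zeros(4, 2))  # expected to raise ValueError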

catalyst.metrics.functional.process_multilabel_components(outputs: torch.Tensor, targets: torch.Tensor, weights: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]

General preprocessing for multilabel-based metrics.

Parameters
  • outputs – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.

  • targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (eg: a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4)

  • weights – importance for each sample

Returns

processed outputs and targets with [batch_size; num_classes] shape
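
An illustrative call (added here; returned values are not asserted): the helper returns the outputs, targets and (possibly default) sample weights, each shaped [batch_size; num_classes]:

>>> outputs, targets, weights = process_multilabel_components(
>>>     outputs=torch.tensor([[0.9, 0.1], [0.2, 0.8]]),
>>>     targets=torch.tensor([[1, 0], [0, 1]]),
>>> )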

catalyst.metrics.functional.process_recsys_components(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor[source]

General pre-processing for the calculation of RecSys metrics.

Parameters
  • outputs (torch.Tensor) – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits

  • targets (torch.Tensor) – binary tensor with ground truth: 1 means the item is relevant for the user and 0 means it is not; size: [batch_size, slate_length]; ground truth, labels

Returns

targets tensor sorted by outputs

Return type

targets_sorted_by_outputs (torch.Tensor)
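
For instance (an added sketch; exact dtype/formatting may differ): sorting the targets below by descending output score moves each slate's relevant items to the positions of its highest-scored items:

>>> process_recsys_components(
>>>     outputs=torch.tensor([
>>>         [4.0, 2.0, 3.0, 1.0],
>>>         [1.0, 2.0, 3.0, 4.0],
>>>     ]),
>>>     targets=torch.tensor([
>>>         [0.0, 0.0, 1.0, 1.0],
>>>         [0.0, 0.0, 1.0, 1.0],
>>>     ]),
>>> )
tensor([[0., 1., 0., 1.],
        [1., 1., 0., 0.]])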

catalyst.metrics.functional.get_binary_statistics(outputs: torch.Tensor, targets: torch.Tensor, label: int = 1) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]

Computes the number of true negatives, false positives, false negatives, true positives and support for a binary classification problem for a given label.

Parameters
  • outputs – estimated targets as predicted by a model with shape [bs; …, 1]

  • targets – ground truth (correct) target values with shape [bs; …, 1]

  • label – integer that specifies the label of interest for the statistics computation

Returns

stats

Return type

Tuple[Tensor, Tensor, Tensor, Tensor, Tensor]

Example

>>> y_pred = torch.tensor([[0, 0, 1, 1, 0, 1, 0, 1]])
>>> y_true = torch.tensor([[0, 1, 0, 1, 0, 0, 1, 1]])
>>> tn, fp, fn, tp, support = get_binary_statistics(y_pred, y_true)
tensor(2) tensor(2) tensor(2) tensor(2) tensor(4)
catalyst.metrics.functional.get_multiclass_statistics(outputs: torch.Tensor, targets: torch.Tensor, argmax_dim: int = -1, num_classes: Optional[int] = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]

Computes the number of true negatives, false positives, false negatives, true positives and support for a multiclass classification problem.

Parameters
  • outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)]

  • targets – ground truth (correct) target values with shape [bs; …, 1]

  • argmax_dim – int that specifies the dimension for the argmax transformation in case of scores/probabilities in outputs

  • num_classes – int that specifies the number of classes if it is known

Returns

stats

Return type

Tuple[Tensor, Tensor, Tensor, Tensor, Tensor]

Example

>>> y_pred = torch.tensor([1, 2, 3, 0])
>>> y_true = torch.tensor([1, 3, 4, 0])
>>> tn, fp, fn, tp, support = get_multiclass_statistics(y_pred, y_true)
tensor([3., 3., 3., 2., 3.]), tensor([0., 0., 1., 1., 0.]),
tensor([0., 0., 0., 1., 1.]), tensor([1., 1., 0., 0., 0.]),
tensor([1., 1., 0., 1., 1.])
catalyst.metrics.functional.get_multilabel_statistics(outputs: torch.Tensor, targets: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]

Computes the number of true negatives, false positives, false negatives, true positives and support for a multilabel classification problem.

Parameters
  • outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)]

  • targets – ground truth (correct) target values with shape [bs; …, 1]

Returns

stats

Return type

Tuple[Tensor, Tensor, Tensor, Tensor, Tensor]

Example

>>> y_pred = torch.tensor([[0, 0, 1, 1], [0, 1, 0, 1]])
>>> y_true = torch.tensor([[0, 1, 0, 1], [0, 0, 1, 1]])
>>> tn, fp, fn, tp, support = get_multilabel_statistics(y_pred, y_true)
tensor([2., 0., 0., 0.]) tensor([0., 1., 1., 0.]),
tensor([0., 1., 1., 0.]) tensor([0., 0., 0., 2.]),
tensor([0., 1., 1., 2.])
>>> y_pred = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
>>> y_true = torch.tensor([0, 1, 2])
>>> tn, fp, fn, tp, support = get_multilabel_statistics(y_pred, y_true)
tensor([2., 2., 2.]) tensor([0., 0., 0.])
tensor([0., 0., 0.]) tensor([1., 1., 1.])
tensor([1., 1., 1.])
>>> y_pred = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
>>> y_true = torch.nn.functional.one_hot(torch.tensor([0, 1, 2]))
>>> tn, fp, fn, tp, support = get_multilabel_statistics(y_pred, y_true)
tensor([2., 2., 2.]) tensor([0., 0., 0.])
tensor([0., 0., 0.]) tensor([1., 1., 1.])
tensor([1., 1., 1.])
catalyst.metrics.functional.get_default_topk_args(num_classes: int) → Sequence[int][source]

Calculate the list of topk parameters for Accuracy@k and mAP@k.

Parameters

num_classes – number of classes

Returns

array of accuracy arguments

Return type

iterable

Examples

>>> get_default_topk_args(num_classes=4)
[1, 3]
>>> get_default_topk_args(num_classes=8)
[1, 3, 5]
catalyst.metrics.functional.wrap_metric_fn_with_activation(metric_fn: Callable, activation: str = None)[source]

Wraps model outputs for metric_fn with the specified activation.

Parameters
  • metric_fn – metric function to compute

  • activation – activation name to use

Returns

wrapped metric function with wrapped model outputs

Note

Works only with metric_fn like metric_fn(outputs, targets, *args, **kwargs).

catalyst.metrics.functional.wrap_topk_metric2dict(metric_fn: Callable, topk_args: Sequence[int]) → Callable[source]

Logging wrapper for metrics with Sequence[Union[torch.Tensor, int, float, Dict]] output. Computes the metric and syncs each element of the output sequence with the passed topk argument.

Parameters
  • metric_fn – metric function to compute

  • topk_args – topk args to sync outputs with

Returns

wrapped metric function with List[Dict] output

catalyst.metrics.functional.wrap_class_metric2dict(metric_fn: Callable, per_class: bool = False, class_args: Sequence[str] = None) → Callable[source]

Logging wrapper for metrics with torch.Tensor output of [num_classes] shape. Computes the metric and syncs each element of the output tensor with the passed class argument.

Parameters
  • metric_fn – metric function to compute

  • per_class – boolean flag to log per class metrics, or use mean/macro statistics otherwise

  • class_args – class names for logging; default is None, in which case class indices are used.

Returns

wrapped metric function with List[Dict] output