Metrics¶
Accuracy¶
- Various accuracy metrics:
- 
catalyst.metrics.accuracy.accuracy(outputs: torch.Tensor, targets: torch.Tensor, topk: Sequence[int] = (1, ), activation: Optional[str] = None) → Sequence[torch.Tensor][source]¶
- Computes multi-class accuracy@topk for the specified values of topk. - Parameters
- outputs – model outputs, logits with shape [bs; num_classes] 
- targets – ground truth, labels with shape [bs; 1] 
- activation – activation to use for model output 
- topk – topk for accuracy@topk computing 
 
- Returns
- list with computed accuracy@topk 
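- Example (a hedged sketch based on the signature above; the toy values and expected result are illustrative):
>>> import torch
>>> from catalyst.metrics.accuracy import accuracy
>>> outputs = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.6, 0.4]])  # logits, [bs=3; num_classes=2]
>>> targets = torch.tensor([1, 0, 1])  # labels
>>> accuracy(outputs, targets, topk=(1,))  # 2 of 3 top-1 predictions match, so accuracy@1 should be ~0.667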
 
- 
catalyst.metrics.accuracy.multi_label_accuracy(outputs: torch.Tensor, targets: torch.Tensor, threshold: Union[float, torch.Tensor], activation: Optional[str] = None) → torch.Tensor[source]¶
- Computes multi-label accuracy for the specified activation and threshold. - Parameters
- outputs – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model. 
- targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (e.g. a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4) 
- threshold – threshold for model output 
- activation – activation to use for model output 
 
- Returns
- computed multi-label accuracy 
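- Example (a hedged sketch; values illustrative):
>>> import torch
>>> from catalyst.metrics.accuracy import multi_label_accuracy
>>> outputs = torch.tensor([[0.9, 0.1, 0.8, 0.2]])  # probabilities, N=1, K=4
>>> targets = torch.tensor([[1.0, 0.0, 1.0, 0.0]])  # binary labels
>>> multi_label_accuracy(outputs, targets, threshold=0.5)  # thresholded predictions match all 4 labels, so expect ~1.0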
 
AUC¶
- 
catalyst.metrics.auc.auc(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor[source]¶
- AUC metric. - Parameters
- outputs – [bs; num_classes] estimated scores from a model. 
- targets – [bs; num_classes] ground truth (correct) target values. 
 
- Returns
- tensor of shape [num_classes] with per-class AUCs 
- Return type
- torch.Tensor 
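- Example (a hedged sketch; the expected value follows from the ROC AUC definition):
>>> import torch
>>> from catalyst.metrics.auc import auc
>>> outputs = torch.tensor([[0.9], [0.8], [0.4], [0.1]])  # scores, [bs=4; num_classes=1]
>>> targets = torch.tensor([[1.0], [1.0], [0.0], [0.0]])
>>> auc(outputs, targets)  # every positive outranks every negative, so per-class AUC should be ~1.0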
 
CMC score¶
- 
catalyst.metrics.cmc_score.cmc_score_count(distances: torch.Tensor, conformity_matrix: torch.Tensor, topk: int = 1) → float[source]¶
- Function to count CMC from distance matrix and conformity matrix. - Parameters
- distances – distance matrix of shape (n_embeddings_x, n_embeddings_y) 
- conformity_matrix – binary matrix with 1 at positions where labels match and 0 otherwise 
- topk – number of top examples for cumulative score counting 
 
- Returns
- cmc score 
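- Example (a hedged sketch; the dtype expected for the conformity matrix may differ between versions):
>>> import torch
>>> from catalyst.metrics.cmc_score import cmc_score_count
>>> distances = torch.tensor([[0.1, 0.9], [0.8, 0.2]])  # (n_embeddings_x, n_embeddings_y)
>>> conformity = torch.tensor([[1, 0], [0, 1]])  # 1 where labels match
>>> cmc_score_count(distances, conformity, topk=1)  # the nearest gallery item conforms for both queries: expect 1.0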
 
- 
catalyst.metrics.cmc_score.cmc_score(query_embeddings: torch.Tensor, gallery_embeddings: torch.Tensor, conformity_matrix: torch.Tensor, topk: int = 1) → float[source]¶
- Function to count CMC score from query and gallery embeddings. - Parameters
- query_embeddings – tensor of shape (n_embeddings, embedding_dim) with embeddings of the objects in the query 
- gallery_embeddings – tensor of shape (n_embeddings, embedding_dim) with embeddings of the objects in the gallery 
- conformity_matrix – binary matrix with 1 at positions where labels match and 0 otherwise 
- topk – number of top examples for cumulative score counting 
 
- Returns
- cmc score 
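- Example (a hedged sketch; embeddings and the expected top-1 hit rate are illustrative):
>>> import torch
>>> from catalyst.metrics.cmc_score import cmc_score
>>> query = torch.tensor([[1.0, 0.0], [0.0, 1.0]])    # (n_embeddings, embedding_dim)
>>> gallery = torch.tensor([[1.0, 0.1], [0.1, 1.0]])
>>> conformity = torch.tensor([[1, 0], [0, 1]])  # 1 where query and gallery labels match
>>> cmc_score(query_embeddings=query, gallery_embeddings=gallery, conformity_matrix=conformity, topk=1)  # each query is closest to its conforming gallery item: expect 1.0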
 
Dice¶
Dice metric.
- 
catalyst.metrics.dice.dice(outputs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]¶
- Computes the dice metric. - Parameters
- outputs – a list of predicted elements 
- targets – a list of elements that are to be predicted 
- eps – epsilon 
- threshold – threshold for outputs binarization 
- activation – A torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”] 
 
- Returns
- Dice score 
- Return type
- float 
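- Example (a hedged sketch for a binary segmentation mask; values illustrative):
>>> import torch
>>> from catalyst.metrics.dice import dice
>>> outputs = torch.tensor([[[[10.0, 10.0], [-10.0, -10.0]]]])  # logits, [bs; ch; h; w]
>>> targets = torch.tensor([[[[1.0, 1.0], [0.0, 0.0]]]])
>>> dice(outputs, targets, threshold=0.5, activation="Sigmoid")  # thresholded sigmoid matches the mask exactly: expect ~1.0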
 
- 
catalyst.metrics.dice.calculate_dice(true_positives: numpy.array, false_positives: numpy.array, false_negatives: numpy.array) → numpy.array[source]¶
- Calculate list of Dice coefficients. - Parameters
- true_positives – true positives numpy array 
- false_positives – false positives numpy array 
- false_negatives – false negatives numpy array 
 
- Returns
- dice score 
- Return type
- np.array 
- Raises
- ValueError – if dice is out of [0; 1] bounds 
 
F1 score¶
F1 score.
- 
catalyst.metrics.f1_score.f1_score(outputs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-07, argmax_dim: int = -1, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]¶
- Fbeta_score with beta=1. - Parameters
- outputs – A list of predicted elements 
- targets – A list of elements that are to be predicted 
- eps – epsilon to avoid zero division 
- argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs 
- num_classes – int that specifies the number of classes, if known 
 
- Returns
- F_1 score 
- Return type
- float 
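- Example (a hedged sketch; the exact return shape depends on num_classes handling):
>>> import torch
>>> from catalyst.metrics.f1_score import f1_score
>>> outputs = torch.tensor([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])  # per-class scores
>>> targets = torch.tensor([0, 1, 1])
>>> f1_score(outputs, targets, num_classes=2)  # argmax over the last dim gives predictions [0, 1, 0]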
 
- 
catalyst.metrics.f1_score.fbeta_score(outputs: torch.Tensor, targets: torch.Tensor, beta: float = 1.0, eps: float = 1e-07, argmax_dim: int = -1, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]¶
- Counts fbeta score for given outputs and targets. - Parameters
- outputs – A list of predicted elements 
- targets – A list of elements that are to be predicted 
- beta – beta param for f_score 
- eps – epsilon to avoid zero division 
- argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs 
- num_classes – int that specifies the number of classes, if known 
 
- Raises
- Exception – If beta is a negative number.
- Returns
- F_beta score. 
- Return type
- float 
 
Focal¶
- Focal losses:
- 
catalyst.metrics.focal.sigmoid_focal_loss(outputs: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0, alpha: float = 0.25, reduction: str = 'mean')[source]¶
- Compute binary focal loss between target and output logits. - Parameters
- outputs – tensor of arbitrary shape 
- targets – tensor of the same shape as input 
- gamma – gamma for focal loss 
- alpha – alpha for focal loss 
- reduction (string, optional) – specifies the reduction to apply to the output: "none" | "mean" | "sum" | "batchwise_mean". "none": no reduction will be applied; "mean": the sum of the output will be divided by the number of elements in the output; "sum": the output will be summed.
 
- Returns
- computed loss 
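- Example (a hedged sketch; the loss value itself is not asserted here):
>>> import torch
>>> from catalyst.metrics.focal import sigmoid_focal_loss
>>> logits = torch.tensor([2.0, -1.0, 0.5])   # arbitrary shape
>>> targets = torch.tensor([1.0, 0.0, 1.0])   # same shape as logits
>>> loss = sigmoid_focal_loss(logits, targets, gamma=2.0, alpha=0.25, reduction="mean")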
 
- 
catalyst.metrics.focal.reduced_focal_loss(outputs: torch.Tensor, targets: torch.Tensor, threshold: float = 0.5, gamma: float = 2.0, reduction='mean') → torch.Tensor[source]¶
- Compute reduced focal loss between target and output logits. - It has been proposed in the Reduced Focal Loss: 1st Place Solution to xView object detection in Satellite Imagery paper. - Note: size_average and reduce params are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. - Source: https://github.com/BloodAxe/pytorch-toolbelt - Parameters
- outputs – tensor of arbitrary shape 
- targets – tensor of the same shape as input 
- threshold – threshold for focal reduction 
- gamma – gamma for focal reduction 
- reduction (string, optional) – specifies the reduction to apply to the output: "none" | "mean" | "sum" | "batchwise_mean". "none": no reduction will be applied; "mean": the sum of the output will be divided by the number of elements in the output; "sum": the output will be summed; "batchwise_mean" computes mean loss per sample in batch. Default: "mean" 
 
- Returns
- computed loss 
- Return type
- torch.Tensor 
 
Hitrate¶
- Hitrate metric:
- 
catalyst.metrics.hitrate.hitrate(outputs: torch.Tensor, targets: torch.Tensor, k=10) → torch.Tensor[source]¶
- Calculate the hit rate score given model outputs and targets. Hit rate is a metric for evaluating ranking systems: generate top-N recommendations for each user, and if one of the recommendations is an item the user has actually rated, count it as a hit (by “rated” we mean any explicit form of user interaction). Add up all of the hits for all users and then divide by the number of users. - A common protocol is to compute the top-N recommendations for each user in the training stage and intentionally remove one of these items from the training data. - Parameters
- outputs (torch.Tensor) – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits 
- targets (torch.Tensor) – binary tensor with ground truth; 1 means the item is relevant for the user and 0 means it is not, size: [batch_size, slate_length]; ground truth, labels 
- k (int) – parameter for evaluation on top-k items 
 
- Returns
- the hit rate score 
- Return type
- hitrate (torch.Tensor) 
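- Example (a hedged sketch; values illustrative):
>>> import torch
>>> from catalyst.metrics.hitrate import hitrate
>>> outputs = torch.tensor([[3.0, 2.0, 1.0], [1.0, 3.0, 2.0]])  # scores, [batch_size, slate_length]
>>> targets = torch.tensor([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # binary relevance
>>> hitrate(outputs, targets, k=1)  # the top-1 item is relevant for the first user only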
 
IoU¶
IoU metric. jaccard is an alias for iou here, with the same functionality.
- 
catalyst.metrics.iou.iou(outputs: torch.Tensor, targets: torch.Tensor, classes: List[str] = None, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid') → torch.Tensor[source]¶
- Parameters
- outputs – A list of predicted elements 
- targets – A list of elements that are to be predicted 
- classes – if classes are specified we reduce across all dims except channels 
- eps – epsilon to avoid zero division 
- threshold – threshold for outputs binarization 
- activation – A torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”] 
 
- Returns
- IoU (Jaccard) score(s) 
- Return type
- Union[float, List[float]] 
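- Example (a hedged sketch for a binary segmentation mask; values illustrative):
>>> import torch
>>> from catalyst.metrics.iou import iou
>>> outputs = torch.tensor([[[[10.0, -10.0], [10.0, -10.0]]]])  # logits, [bs; ch; h; w]
>>> targets = torch.tensor([[[[1.0, 0.0], [1.0, 0.0]]]])
>>> iou(outputs, targets, threshold=0.5, activation="Sigmoid")  # prediction equals target, so IoU should be ~1.0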
 
- 
catalyst.metrics.iou.jaccard(outputs: torch.Tensor, targets: torch.Tensor, classes: List[str] = None, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid') → torch.Tensor¶
- Parameters
- outputs – A list of predicted elements 
- targets – A list of elements that are to be predicted 
- classes – if classes are specified we reduce across all dims except channels 
- eps – epsilon to avoid zero division 
- threshold – threshold for outputs binarization 
- activation – A torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”] 
 
- Returns
- IoU (Jaccard) score(s) 
- Return type
- Union[float, List[float]] 
 
MRR¶
MRR metric.
- 
catalyst.metrics.mrr.mrr(outputs: torch.Tensor, targets: torch.Tensor, k=100) → torch.Tensor[source]¶
- Calculate the Mean Reciprocal Rank (MRR) score given model outputs and targets; users’ data is aggregated in batches. - MRR@k is the mean, over all users, of the reciprocal rank of the highest-ranked relevant item if it appears in the top k, and 0 otherwise. https://en.wikipedia.org/wiki/Mean_reciprocal_rank - Parameters
- outputs (torch.Tensor) – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits 
- targets (torch.Tensor) – binary tensor with ground truth; 1 means the item is relevant for the user and 0 means it is not, size: [batch_size, slate_length]; ground truth, labels 
- k (int) – parameter for evaluation on top-k items 
 
- Returns
- The MRR score for each user. 
- Return type
- result (torch.Tensor) 
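- Example (a hedged sketch; the reciprocal ranks follow from the metric definition):
>>> import torch
>>> from catalyst.metrics.mrr import mrr
>>> outputs = torch.tensor([[3.0, 2.0, 1.0], [1.0, 3.0, 2.0]])  # scores, [batch_size, slate_length]
>>> targets = torch.tensor([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])  # binary relevance
>>> mrr(outputs, targets, k=3)  # the relevant item sits at rank 2 for both users, so each reciprocal rank is 0.5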
 
MAP¶
MAP metric.
- 
catalyst.metrics.avg_precision.mean_avg_precision(outputs: torch.Tensor, targets: torch.Tensor, top_k: List[int]) → Dict[str, int][source]¶
- Calculate the mean average precision (MAP) for RecSys: the metric computes the mean of the average precision (AP) across all batches. - MAP amplifies the interest in finding many relevant items for each query. - Parameters
- outputs (torch.Tensor) – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits 
- targets (torch.Tensor) – binary tensor with ground truth; 1 means the item is relevant and 0 means it is not, size: [batch_size, slate_length]; ground truth, labels 
- top_k (List[int]) – list of k values at which to evaluate the top-k items 
 
- Returns
- The MAP score for every k, size: [len(top_k), 1] 
- Return type
- result (Dict[str, int]) 
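- Example (a hedged sketch; one MAP value per requested k):
>>> import torch
>>> from catalyst.metrics.avg_precision import mean_avg_precision
>>> outputs = torch.tensor([[3.0, 2.0, 1.0], [1.0, 3.0, 2.0]])  # scores, [batch_size, slate_length]
>>> targets = torch.tensor([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])  # binary relevance
>>> mean_avg_precision(outputs, targets, top_k=[1, 3])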
 
- 
catalyst.metrics.avg_precision.avg_precision(outputs: torch.Tensor, targets: torch.Tensor, k=10) → torch.Tensor[source]¶
- Calculate the Average Precision for RecSys. The precision metric summarizes the fraction of relevant items in the whole recommendation list. - To compute precision at k, set the threshold rank to k and compute the percentage of relevant items in the top k, ignoring the items ranked lower than k. - The average precision at k (AP@k) summarizes the average precision for relevant items up to the k-th one. Wikipedia entry for average precision: https://en.wikipedia.org/w/index.php?title=Information_retrieval&oldid=793358396#Average_precision - If a relevant item never gets retrieved, we assume the precision corresponding to that item to be zero. - Parameters
- outputs (torch.Tensor) – tensor with predicted scores, size: [batch_size, slate_length]; model outputs, logits 
- targets (torch.Tensor) – binary tensor with ground truth; 1 means the item is relevant and 0 means it is not, size: [batch_size, slate_length]; ground truth, labels 
- k (int) – the position at which to compute the truncated AP; must be positive 
 
- Returns
- The AP score for each sample in the batch, size: [batch_size, 1] 
- Return type
- result (torch.Tensor) 
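- Example (a hedged sketch; the expected value follows from the AP definition):
>>> import torch
>>> from catalyst.metrics.avg_precision import avg_precision
>>> outputs = torch.tensor([[3.0, 2.0, 1.0]])  # one user's slate scores
>>> targets = torch.tensor([[1.0, 0.0, 1.0]])  # relevant items at ranks 1 and 3
>>> avg_precision(outputs, targets, k=3)  # precisions at the relevant ranks are 1/1 and 2/3, so AP@3 should be ~0.83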
 
NDCG¶
Discounted Cumulative Gain metrics
- 
catalyst.metrics.ndcg.dcg(outputs: torch.Tensor, targets: torch.Tensor, k=10, gain_function='pow_rank') → torch.Tensor[source]¶
- Computes DCG@topk for the specified values of k. Graded relevance is used as a measure of usefulness, or gain, from examining a set of items; gain may be reduced at lower ranks. Reference: https://en.wikipedia.org/wiki/Discounted_cumulative_gain - Parameters
- outputs (torch.Tensor) – model outputs, logits with shape [batch_size; slate_length] 
- targets (torch.Tensor) – ground truth, labels with shape [batch_size; slate_length] 
- gain_function – string that indicates the gain function for the ground truth labels. Two options are available: "pow_rank" (torch.pow(2, x) - 1) and "rank" (x). By default, "pow_rank" is used to emphasize retrieving the relevant documents. 
- k (int) – parameter for evaluation on top-k items 
 
- Returns
- torch.Tensor for dcg at k 
- Raises
- ValueError – if the gain function is neither pow_rank nor rank 
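- Example (a hedged sketch; the value follows from the standard DCG formula with the pow_rank gain):
>>> import torch
>>> from catalyst.metrics.ndcg import dcg
>>> outputs = torch.tensor([[3.0, 2.0, 1.0]])  # scores, [batch_size; slate_length]
>>> targets = torch.tensor([[1.0, 0.0, 1.0]])  # graded relevance
>>> dcg(outputs, targets, k=3, gain_function="pow_rank")  # gains [1, 0, 1] discounted by log2(rank + 1): expect ~1 + 0 + 0.5 = 1.5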
 
- 
catalyst.metrics.ndcg.ndcg(outputs: torch.Tensor, targets: torch.Tensor, top_k: List[int], gain_function='pow_rank') → torch.Tensor[source]¶
- Computes nDCG@topk for the specified values of top_k. - Parameters
- outputs (torch.Tensor) – model outputs, logits with shape [batch_size; slate_size] 
- targets (torch.Tensor) – ground truth, labels with shape [batch_size; slate_size] 
- gain_function – gain function for the ground truth labels; by default, torch.pow(2, x) - 1 is used to emphasize retrieving the relevant documents. 
- top_k (List[int]) – parameter for evaluation on top-k items 
 
- Returns
- tuple with computed ndcg@topk 
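- Example (a hedged sketch; one nDCG value per requested k):
>>> import torch
>>> from catalyst.metrics.ndcg import ndcg
>>> outputs = torch.tensor([[3.0, 2.0, 1.0]])  # scores, [batch_size; slate_size]
>>> targets = torch.tensor([[1.0, 0.0, 1.0]])  # graded relevance
>>> ndcg(outputs, targets, top_k=[1, 3])  # DCG at each k normalized by the ideal DCG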
 
Precision¶
- 
catalyst.metrics.precision.average_precision(outputs: torch.Tensor, targets: torch.Tensor, weights: Optional[torch.Tensor] = None) → torch.Tensor[source]¶
- Computes the average precision. - Parameters
- outputs – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model. 
- targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (e.g. a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4) 
- weights – importance for each sample 
 
- Returns
- tensor of [K; ] shape, with average precision for K classes 
- Return type
- torch.Tensor 
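- Example (a hedged sketch with a single class, K=1; the value follows from the AP definition):
>>> import torch
>>> from catalyst.metrics.precision import average_precision
>>> outputs = torch.tensor([[0.9], [0.8], [0.3], [0.1]])  # N=4 examples, K=1 class
>>> targets = torch.tensor([[1.0], [0.0], [1.0], [0.0]])
>>> average_precision(outputs, targets)  # precisions at the relevant ranks are 1/1 and 2/3, so expect ~0.83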
 
- 
catalyst.metrics.precision.precision(outputs: torch.Tensor, targets: torch.Tensor, argmax_dim: int = -1, eps: float = 1e-07, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]¶
- Multiclass precision metric. - Parameters
- outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)] 
- targets – ground truth (correct) target values with shape [bs; …, 1] 
- argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs 
- eps – float. Epsilon to avoid zero division. 
- num_classes – int that specifies the number of classes, if known 
 
- Returns
- precision for every class 
- Return type
- Tensor 
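- Example (a hedged sketch; the per-class values follow from the predictions [0, 1, 0]):
>>> import torch
>>> from catalyst.metrics.precision import precision
>>> outputs = torch.tensor([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]])  # scores, argmax over the last dim
>>> targets = torch.tensor([0, 1, 1])
>>> precision(outputs, targets, num_classes=2)  # class 0 is predicted twice with one correct, class 1 once and correctly: expect ~[0.5, 1.0]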
 
Recall¶
- 
catalyst.metrics.recall.recall(outputs: torch.Tensor, targets: torch.Tensor, argmax_dim: int = -1, eps: float = 1e-07, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]¶
- Multiclass recall metric. - Parameters
- outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)] 
- targets – ground truth (correct) target values with shape [bs; …, 1] 
- argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs 
- eps – float. Epsilon to avoid zero division. 
- num_classes – int that specifies the number of classes, if known. 
 
- Returns
- recall for every class 
- Return type
- Tensor 
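- Example (a hedged sketch; the per-class values follow from the predictions [0, 1, 0]):
>>> import torch
>>> from catalyst.metrics.recall import recall
>>> outputs = torch.tensor([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]])  # scores, argmax over the last dim
>>> targets = torch.tensor([0, 1, 1])
>>> recall(outputs, targets, num_classes=2)  # class 0's single instance is found, class 1 gets one of two: expect ~[1.0, 0.5]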
 
Functional¶
- 
catalyst.metrics.functional.process_multilabel_components(outputs: torch.Tensor, targets: torch.Tensor, weights: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]¶
- General preprocessing for multi-label-based metrics. - Parameters
- outputs – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model. 
- targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (e.g. a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4) 
- weights – importance for each sample 
 
- Returns
- processed outputs and targets with [batch_size; num_classes] shape
 
- 
catalyst.metrics.functional.get_binary_statistics(outputs: torch.Tensor, targets: torch.Tensor, label: int = 1) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]¶
- Computes the number of true negative, false positive, false negative, true positive and support for a binary classification problem for a given label. - Parameters
- outputs – estimated targets as predicted by a model with shape [bs; …, 1] 
- targets – ground truth (correct) target values with shape [bs; …, 1] 
- label – integer, that specifies label of interest for statistics compute 
 
- Returns
- stats 
- Return type
- Tuple[Tensor, Tensor, Tensor, Tensor, Tensor] 
- Example
>>> y_pred = torch.tensor([[0, 0, 1, 1, 0, 1, 0, 1]])
>>> y_true = torch.tensor([[0, 1, 0, 1, 0, 0, 1, 1]])
>>> tn, fp, fn, tp, support = get_binary_statistics(y_pred, y_true)
tensor(2) tensor(2) tensor(2) tensor(2) tensor(4)
- 
catalyst.metrics.functional.get_multiclass_statistics(outputs: torch.Tensor, targets: torch.Tensor, argmax_dim: int = -1, num_classes: Optional[int] = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]¶
- Computes the number of true negative, false positive, false negative, true positive and support for a multi-class classification problem. - Parameters
- outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)] 
- targets – ground truth (correct) target values with shape [bs; …, 1] 
- argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs 
- num_classes – int that specifies the number of classes, if known 
 
- Returns
- stats 
- Return type
- Tuple[Tensor, Tensor, Tensor, Tensor, Tensor] 
- Example
>>> y_pred = torch.tensor([1, 2, 3, 0])
>>> y_true = torch.tensor([1, 3, 4, 0])
>>> tn, fp, fn, tp, support = get_multiclass_statistics(y_pred, y_true)
tensor([3., 3., 3., 2., 3.]), tensor([0., 0., 1., 1., 0.]), tensor([0., 0., 0., 1., 1.]), tensor([1., 1., 0., 0., 0.]), tensor([1., 1., 0., 1., 1.])
- 
catalyst.metrics.functional.get_multilabel_statistics(outputs: torch.Tensor, targets: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]¶
- Computes the number of true negative, false positive, false negative, true positive and support for a multi-label classification problem. - Parameters
- outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)] 
- targets – ground truth (correct) target values with shape [bs; …, 1] 
 
- Returns
- stats 
- Return type
- Tuple[Tensor, Tensor, Tensor, Tensor, Tensor] 
- Example
>>> y_pred = torch.tensor([[0, 0, 1, 1], [0, 1, 0, 1]])
>>> y_true = torch.tensor([[0, 1, 0, 1], [0, 0, 1, 1]])
>>> tn, fp, fn, tp, support = get_multilabel_statistics(y_pred, y_true)
tensor([2., 0., 0., 0.]) tensor([0., 1., 1., 0.]) tensor([0., 1., 1., 0.]) tensor([0., 0., 0., 2.]) tensor([0., 1., 1., 2.])
>>> y_pred = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
>>> y_true = torch.tensor([0, 1, 2])
>>> tn, fp, fn, tp, support = get_multilabel_statistics(y_pred, y_true)
tensor([2., 2., 2.]) tensor([0., 0., 0.]) tensor([0., 0., 0.]) tensor([1., 1., 1.]) tensor([1., 1., 1.])
>>> y_pred = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
>>> y_true = torch.nn.functional.one_hot(torch.tensor([0, 1, 2]))
>>> tn, fp, fn, tp, support = get_multilabel_statistics(y_pred, y_true)
tensor([2., 2., 2.]) tensor([0., 0., 0.]) tensor([0., 0., 0.]) tensor([1., 1., 1.]) tensor([1., 1., 1.])
- 
catalyst.metrics.functional.get_default_topk_args(num_classes: int) → Sequence[int][source]¶
- Calculate list params for Accuracy@k and mAP@k. - Parameters
- num_classes – number of classes 
- Returns
- array of accuracy arguments 
- Return type
- iterable 
- Examples
>>> get_default_topk_args(num_classes=4)
[1, 3]
>>> get_default_topk_args(num_classes=8)
[1, 3, 5]
- 
catalyst.metrics.functional.wrap_topk_metric2dict(metric_fn: Callable, topk_args: Sequence[int]) → Callable[source]¶
- Logging wrapper for metrics with Sequence[Union[torch.Tensor, int, float, Dict]] output. Computes the metric and syncs each element of the output sequence with the passed topk arguments. - Parameters
- metric_fn – metric function to compute 
- topk_args – topk args to sync outputs with 
 
- Returns
- wrapped metric function with List[Dict] output 
- Raises
- NotImplementedError – if the metric’s returned values fall outside the torch.Tensor, int, float, Dict union. 
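- Example (a hedged sketch; the exact key format of the returned dict may differ between versions):
>>> import torch
>>> from catalyst.metrics.accuracy import accuracy
>>> from catalyst.metrics.functional import wrap_topk_metric2dict
>>> accuracy2dict = wrap_topk_metric2dict(accuracy, topk_args=(1, 3))
>>> outputs = torch.rand(4, 5)  # logits for 4 samples, 5 classes
>>> targets = torch.randint(0, 5, (4,))
>>> accuracy2dict(outputs, targets)  # metric values synced with the topk args, in the wrapper's dict-based format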
 
- 
catalyst.metrics.functional.wrap_class_metric2dict(metric_fn: Callable, class_args: Sequence[str] = None) → Callable[source]¶
- Logging wrapper for metrics with torch.Tensor output of shape [num_classes]. Computes the metric and syncs each element of the output tensor with the passed class argument. - Parameters
- metric_fn – metric function to compute 
- class_args – class names for logging; default: None (class indices will be used) 
 
- Returns
- wrapped metric function with List[Dict] output