Metrics¶
Metric API¶
IMetric¶
- class catalyst.metrics._metric.IMetric(compute_on_call: bool = True)[source]¶
Bases: abc.ABC
Interface for all Metrics.
- Parameters
compute_on_call – computes and returns the metric value during the metric call; used for per-batch logging. Default: True
- abstract compute() → Any[source]¶
Computes the metric based on its accumulated state.
By default, this is called at the end of each loader (on_loader_end event).
- Returns
computed value (it is better to return it in key-value format)
- Return type
Any
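To make the interface concrete, here is a minimal sketch of a custom IMetric implementation that tracks a running mean; the class name, attributes, and the update signature are illustrative assumptions rather than part of the documented API.
from catalyst.metrics._metric import IMetric

class RunningMeanMetric(IMetric):
    """Hypothetical metric: accumulates a running mean of scalar values."""

    def __init__(self, compute_on_call: bool = True):
        super().__init__(compute_on_call=compute_on_call)
        self.value_sum = 0.0
        self.num_values = 0

    def reset(self) -> None:
        # clear the accumulated state, typically at on_loader_start
        self.value_sum, self.num_values = 0.0, 0

    def update(self, value: float) -> float:
        # accumulate a new per-batch value
        self.value_sum += value
        self.num_values += 1
        return value

    def compute(self) -> float:
        # called at the end of each loader (on_loader_end event)
        return self.value_sum / max(self.num_values, 1)

metric = RunningMeanMetric()
metric.reset()
for value in [1.0, 2.0, 3.0]:
    metric.update(value)
metric.compute()  # 2.0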
ICallbackBatchMetric¶
- class catalyst.metrics._metric.ICallbackBatchMetric(compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.IMetric
Interface for all batch-based Metrics.
- abstract compute_key_value() → Dict[str, float][source]¶
Computes the metric based on its accumulated state.
By default, this is called at the end of each loader (on_loader_end event).
- Returns
computed value in key-value format
- Return type
Dict
- abstract update_key_value(*args, **kwargs) → Dict[str, float][source]¶
Updates the metric with new input.
By default, this is called at the end of each batch (on_batch_end event).
- Parameters
*args – positional arguments with the new input
**kwargs – keyword arguments with the new input
- Returns
computed value in key-value format
- Return type
Dict
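As a rough sketch of how a batch metric is typically driven (per-batch update_key_value, per-loader compute_key_value), the example below reuses the AccuracyMetric implementation documented later on this page; the random data and the chosen topk value are illustrative assumptions.
import torch
from catalyst import metrics

metric = metrics.AccuracyMetric(topk_args=(1,))

metric.reset()  # typically on_loader_start
for _ in range(3):  # one iteration per batch
    logits = torch.rand(8, 4)
    targets = torch.randint(0, 4, (8,))
    # on_batch_end: per-batch key-value logging
    batch_values = metric.update_key_value(logits, targets)
# on_loader_end: aggregated key-value logging
loader_values = metric.compute_key_value()
loader_values["accuracy01"]  # mean accuracy@1 over the three batches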
ICallbackLoaderMetric¶
- class catalyst.metrics._metric.ICallbackLoaderMetric(compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.IMetric
Interface for all loader-based Metrics.
- Parameters
compute_on_call – computes and returns the metric value during the metric call; used for per-batch logging. Default: True
prefix – metrics prefix
suffix – metrics suffix
- abstract compute_key_value() → Dict[str, float][source]¶
Computes the metric based on its accumulated state.
By default, this is called at the end of each loader (on_loader_end event).
- Returns
computed value in key-value format
- Return type
Dict
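For loader-based metrics, reset receives the loader size so that storage can be preallocated. The following sketch shows the typical reset/update/compute_key_value flow using the AUCMetric implementation documented below; the toy scores and targets are illustrative assumptions.
import torch
from catalyst import metrics

batches = [
    (torch.tensor([[0.9, 0.1], [0.2, 0.8]]), torch.tensor([[1, 0], [0, 1]])),
    (torch.tensor([[0.7, 0.3], [0.4, 0.6]]), torch.tensor([[1, 0], [0, 1]])),
]

metric = metrics.AUCMetric()
# on_loader_start: preallocate storage for the whole loader
metric.reset(num_batches=len(batches), num_samples=4)
for scores, targets in batches:
    # on_batch_end: only accumulate, no per-batch value is logged
    metric.update(scores, targets)
# on_loader_end: aggregated key-value logging
metric.compute_key_value()  # {'auc': ..., 'auc/_micro': ..., 'auc/_macro': ..., ...}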
General Metrics¶
AccumulativeMetric¶
- class catalyst.metrics._accumulative.AccumulativeMetric(keys: Iterable[str] = None, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackLoaderMetric
This metric accumulates all the input data along the loader.
- Parameters
keys – list of keys to accumulate data from batch
compute_on_call – if True, allows compute metric’s value on call
prefix – metric prefix
suffix – metric suffix
- compute() → Dict[str, torch.Tensor][source]¶
Returns the accumulated data.
- Returns
dict of accumulated data
- compute_key_value() → Dict[str, torch.Tensor][source]¶
Returns the accumulated data.
- Returns
dict of accumulated data
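A minimal usage sketch, assuming two toy batches with "logits" and "targets" keys (the key names and batch contents are illustrative, not prescribed by the API):
import torch
from catalyst import metrics

batches = [
    {"logits": torch.rand(4, 3), "targets": torch.randint(0, 3, (4,))},
    {"logits": torch.rand(4, 3), "targets": torch.randint(0, 3, (4,))},
]

metric = metrics.AccumulativeMetric(keys=["logits", "targets"])
metric.reset(num_batches=len(batches), num_samples=8)
for batch in batches:
    metric.update(**batch)

accumulated = metric.compute()
accumulated["logits"].shape   # torch.Size([8, 3]) - all batches stacked along dim 0
accumulated["targets"].shape  # torch.Size([8])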
AdditiveMetric¶
- class catalyst.metrics._additive.AdditiveMetric(compute_on_call: bool = True, mode: str = 'numpy')[source]¶
Bases: catalyst.metrics._metric.IMetric
This metric computes mean and std values of input data.
- Parameters
compute_on_call – if True, computes and returns metric value during metric call
mode – expected dtype returned by the metric, "numpy" or "torch"
- Raises
ValueError – if mode is not supported
Examples:
import numpy as np
from catalyst import metrics

values = [1, 2, 3, 4, 5]
num_samples_list = [1, 2, 3, 4, 5]
true_values = [1, 1.666667, 2.333333, 3, 3.666667]

metric = metrics.AdditiveMetric()
for value, num_samples, true_value in zip(values, num_samples_list, true_values):
    metric.update(value=value, num_samples=num_samples)
    mean, _ = metric.compute()
    assert np.isclose(mean, true_value)
import os from torch import nn, optim from torch.nn import functional as F from torch.utils.data import DataLoader from catalyst import dl, metrics from catalyst.data import ToTensor from catalyst.contrib.datasets import MNIST model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)) optimizer = optim.Adam(model.parameters(), lr=0.02) loaders = { "train": DataLoader( MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32 ), "valid": DataLoader( MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32 ), } class CustomRunner(dl.Runner): def predict_batch(self, batch): # model inference step return self.model(batch[0].to(self.device)) def on_loader_start(self, runner): super().on_loader_start(runner) self.meters = { key: metrics.AdditiveMetric(compute_on_call=False) for key in ["loss", "accuracy01", "accuracy03"] } def handle_batch(self, batch): # model train/valid step # unpack the batch x, y = batch # run model forward pass logits = self.model(x) # compute the loss loss = F.cross_entropy(logits, y) # compute other metrics of interest accuracy01, accuracy03 = metrics.accuracy(logits, y, topk=(1, 3)) # log metrics self.batch_metrics.update( {"loss": loss, "accuracy01": accuracy01, "accuracy03": accuracy03} ) for key in ["loss", "accuracy01", "accuracy03"]: self.meters[key].update(self.batch_metrics[key].item(), self.batch_size) # run model backward pass if self.is_train_loader: loss.backward() self.optimizer.step() self.optimizer.zero_grad() def on_loader_end(self, runner): for key in ["loss", "accuracy01", "accuracy03"]: self.loader_metrics[key] = self.meters[key].compute()[0] super().on_loader_end(runner) runner = CustomRunner() # model training runner.train( model=model, optimizer=optimizer, loaders=loaders, logdir="./logs", num_epochs=5, verbose=True, valid_loader="valid", valid_metric="loss", minimize_valid_metric=True, )
Note
Please follow the minimal examples sections for more use cases.
ConfusionMatrixMetric¶
- class catalyst.metrics._confusion_matrix.ConfusionMatrixMetric(num_classes: int, normalized: bool = False, compute_on_call: bool = True)[source]¶
Bases: catalyst.metrics._metric.IMetric
Constructs a confusion matrix for multiclass classification problems.
- Parameters
num_classes – number of classes in the classification problem
normalized – determines whether the confusion matrix is normalized
compute_on_call – if True, computes and returns the confusion matrix during __call__. Default: True
Examples:
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_samples, num_features, num_classes = int(1e4), int(1e1), 4 X = torch.rand(num_samples, num_features) y = (torch.rand(num_samples,) * num_classes).to(torch.int64) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_classes) criterion = torch.nn.CrossEntropyLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, logdir="./logdir", num_epochs=3, valid_loader="valid", valid_metric="accuracy03", minimize_valid_metric=False, verbose=True, callbacks=[ dl.AccuracyCallback( input_key="logits", target_key="targets", num_classes=num_classes ), dl.PrecisionRecallF1SupportCallback( input_key="logits", target_key="targets", num_classes=num_classes ), dl.AUCCallback(input_key="logits", target_key="targets"), dl.ConfusionMatrixCallback( input_key="logits", target_key="targets", num_classes=num_classes ), ], )
Note
Please follow the minimal examples sections for more use cases.
FunctionalBatchMetric¶
- class catalyst.metrics._functional_metric.FunctionalBatchMetric(metric_fn: Callable, metric_key: str, compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackBatchMetric
Class for custom batch-based metrics in a functional way.
- Parameters
metric_fn – metric function that takes outputs and targets and returns the score as a torch.Tensor
metric_key – metric name
compute_on_call – Computes and returns metric value during metric call. Used for per-batch logging. default: True
prefix – metric prefix
suffix – metric suffix
Note
Loader metrics are calculated as the average over all batch metrics.
Examples:
import torch
import sklearn.metrics
from catalyst import metrics

outputs = torch.tensor([1, 0, 2, 1])
targets = torch.tensor([3, 0, 2, 2])

metric = metrics.FunctionalBatchMetric(
    metric_fn=sklearn.metrics.accuracy_score,
    metric_key="sk_accuracy",
)
metric.reset()
metric.update(batch_size=len(outputs), y_pred=outputs, y_true=targets)
metric.compute()
# (0.5, 0.0)  # mean, std
metric.compute_key_value()
# {'sk_accuracy': 0.5, 'sk_accuracy/mean': 0.5, 'sk_accuracy/std': 0.0}
FunctionalLoaderMetric¶
- class catalyst.metrics._functional_metric.FunctionalLoaderMetric(metric_fn: Callable, metric_key: str, accumulative_fields: Iterable[str] = None, compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackLoaderMetric
Class for custom loader-based metrics in a functional way.
- Parameters
metric_fn – metric function that takes outputs and targets and returns the score as a torch.Tensor
metric_key – metric name
accumulative_fields – list of keys to accumulate data from batch
compute_on_call – if True, allows compute metric’s value on call
prefix – metric prefix
suffix – metric suffix
Note
Metrics are calculated over all samples.
Examples:
from functools import partial

import torch
import sklearn.metrics
from catalyst import metrics

targets = torch.tensor([3, 0, 2, 2, 1])
outputs = torch.rand((len(targets), targets.max() + 1)).softmax(1)

metric = metrics.FunctionalLoaderMetric(
    metric_fn=partial(
        sklearn.metrics.roc_auc_score, average="macro", multi_class="ovr"
    ),
    metric_key="sk_auc",
    accumulative_fields=["y_score", "y_true"],
)
metric.reset(len(outputs), len(outputs))
metric.update(y_score=outputs, y_true=targets)
metric.compute()
# ...
metric.compute_key_value()
# {'sk_auc': ...}
Runner Metrics¶
Accuracy - AccuracyMetric¶
- class catalyst.metrics._accuracy.AccuracyMetric(topk_args: List[int] = None, num_classes: int = None, compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackBatchMetric
This metric computes accuracy for the multiclass classification case. It computes the mean accuracy and an approximate std value (note that this is not the true std of accuracy, but the std of accuracy over per-batch mean values).
- Parameters
topk_args – list of topk for accuracy@topk computing
num_classes – number of classes
compute_on_call – if True, computes and returns metric value during metric call
prefix – metric prefix
suffix – metric suffix
Examples:
import torch from catalyst import metrics outputs = torch.tensor([ [0.2, 0.5, 0.0, 0.3], [0.9, 0.1, 0.0, 0.0], [0.0, 0.1, 0.6, 0.3], [0.0, 0.8, 0.2, 0.0], ]) targets = torch.tensor([3, 0, 2, 2]) metric = metrics.AccuracyMetric(topk_args=(1, 3)) metric.reset() metric.update(outputs, targets) metric.compute() # ( # (0.5, 1.0), # top1, top3 mean # (0.0, 0.0), # top1, top3 std # ) metric.compute_key_value() # { # 'accuracy': 0.5, # 'accuracy/std': 0.0, # 'accuracy01': 0.5, # 'accuracy01/std': 0.0, # 'accuracy03': 1.0, # 'accuracy03/std': 0.0, # } metric.reset() metric(outputs, targets) # ( # (0.5, 1.0), # top1, top3 mean # (0.0, 0.0), # top1, top3 std # )
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_samples, num_features, num_classes = int(1e4), int(1e1), 4 X = torch.rand(num_samples, num_features) y = (torch.rand(num_samples,) * num_classes).to(torch.int64) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_classes) criterion = torch.nn.CrossEntropyLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, logdir="./logdir", num_epochs=3, valid_loader="valid", valid_metric="accuracy03", minimize_valid_metric=False, verbose=True, callbacks=[ dl.AccuracyCallback( input_key="logits", target_key="targets", num_classes=num_classes ), dl.PrecisionRecallF1SupportCallback( input_key="logits", target_key="targets", num_classes=num_classes ), dl.AUCCallback(input_key="logits", target_key="targets"), ], )
Note
Please follow the minimal examples sections for more use cases.
Accuracy - MultilabelAccuracyMetric¶
- class catalyst.metrics._accuracy.MultilabelAccuracyMetric(threshold: Union[float, torch.Tensor] = 0.5, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._additive.AdditiveMetric, catalyst.metrics._metric.ICallbackBatchMetric
This metric computes accuracy for the multilabel classification case. It computes the mean accuracy and an approximate std value (note that this is not the true std of accuracy, but the std of accuracy over per-batch mean values).
- Parameters
compute_on_call – if True, computes and returns metric value during metric call
prefix – metric prefix
suffix – metric suffix
threshold – thresholds for model scores
Examples:
import torch from catalyst import metrics outputs = torch.tensor([ [0.1, 0.9, 0.0, 0.8], [0.96, 0.01, 0.85, 0.2], [0.98, 0.4, 0.2, 0.1], [0.1, 0.89, 0.2, 0.0], ]) targets = torch.tensor([ [0, 1, 1, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 1, 0, 0], ]) metric = metrics.MultilabelAccuracyMetric(threshold=0.6) metric.reset() metric.update(outputs, targets) metric.compute() # (0.75, 0.0) # mean, std metric.compute_key_value() # { # 'accuracy': 0.75, # 'accuracy/std': 0.0, # } metric.reset() metric(outputs, targets) # (0.75, 0.0) # mean, std
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_samples, num_features, num_classes = int(1e4), int(1e1), 4 X = torch.rand(num_samples, num_features) y = (torch.rand(num_samples, num_classes) > 0.5).to(torch.float32) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_classes) criterion = torch.nn.BCEWithLogitsLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, logdir="./logdir", num_epochs=3, valid_loader="valid", valid_metric="accuracy", minimize_valid_metric=False, verbose=True, callbacks=[ dl.AUCCallback(input_key="logits", target_key="targets"), dl.MultilabelAccuracyCallback( input_key="logits", target_key="targets", threshold=0.5 ) ] )
Note
Please follow the minimal examples sections for more use cases.
AUCMetric¶
- class catalyst.metrics._auc.AUCMetric(compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackLoaderMetric
AUC metric.
- Parameters
compute_on_call – if True, computes and returns metric value during metric call
prefix – metric prefix
suffix – metric suffix
Warning
This metric is under API improvement.
Examples:
import torch from catalyst import metrics scores = torch.tensor([ [0.9, 0.1], [0.1, 0.9], ]) targets = torch.tensor([ [1, 0], [0, 1], ]) metric = metrics.AUCMetric() # for efficient statistics storage metric.reset(num_batches=1, num_samples=len(scores)) metric.update(scores, targets) metric.compute() # ( # tensor([1., 1.]) # per class # 1.0, # micro # 1.0, # macro # 1.0 # weighted # ) metric.compute_key_value() # { # 'auc': 1.0, # 'auc/_micro': 1.0, # 'auc/_macro': 1.0, # 'auc/_weighted': 1.0 # 'auc/class_00': 1.0, # 'auc/class_01': 1.0, # } metric.reset(num_batches=1, num_samples=len(scores)) metric(scores, targets) # ( # tensor([1., 1.]) # per class # 1.0, # micro # 1.0, # macro # 1.0 # weighted # )
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_samples, num_features, num_classes = int(1e4), int(1e1), 4 X = torch.rand(num_samples, num_features) y = (torch.rand(num_samples,) * num_classes).to(torch.int64) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_classes) criterion = torch.nn.CrossEntropyLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, logdir="./logdir", num_epochs=3, valid_loader="valid", valid_metric="accuracy03", minimize_valid_metric=False, verbose=True, callbacks=[ dl.AccuracyCallback( input_key="logits", target_key="targets", num_classes=num_classes ), dl.PrecisionRecallF1SupportCallback( input_key="logits", target_key="targets", num_classes=num_classes ), dl.AUCCallback(input_key="logits", target_key="targets"), ], )
Note
Please follow the minimal examples sections for more use cases.
Classification – BinaryPrecisionRecallF1Metric¶
- class catalyst.metrics._classification.BinaryPrecisionRecallF1Metric(zero_division: int = 0, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._classification.StatisticsMetric
Precision, recall, f1_score and support metrics for binary classification.
- Parameters
zero_division – value to set in case of zero division during metrics (precision, recall) computation; should be one of 0 or 1
compute_on_call – if True, allows compute metric’s value on call
prefix – metric prefix
suffix – metric suffix
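A minimal usage sketch, under the assumptions that the metric consumes already-binarized predictions and that compute() returns precision, recall, and f1 in that order (please check the source for the exact return signature):
import torch
from catalyst import metrics

# binarized predictions and binary targets: tp=3, fp=1, fn=1
outputs = torch.tensor([1, 1, 0, 1, 0, 1])
targets = torch.tensor([1, 0, 0, 1, 1, 1])

metric = metrics.BinaryPrecisionRecallF1Metric()
metric.reset()
metric.update(outputs=outputs, targets=targets)
precision, recall, f1 = metric.compute()
# precision = 3 / 4 = 0.75, recall = 3 / 4 = 0.75, f1 ≈ 0.75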
Classification – MulticlassPrecisionRecallF1SupportMetric¶
- class catalyst.metrics._classification.MulticlassPrecisionRecallF1SupportMetric(num_classes: int = None, zero_division: int = 0, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._classification.PrecisionRecallF1SupportMetric
Precision, recall, f1_score and support metrics for multiclass classification. Computes metrics with macro, micro, and weighted averaging.
- Parameters
num_classes – number of classes in loader’s dataset
zero_division – value to set in case of zero division during metrics (precision, recall) computation; should be one of 0 or 1
compute_on_call – if True, allows compute metric’s value on call
prefix – metric prefix
suffix – metric suffix
Examples:
import torch from catalyst import metrics num_classes = 4 zero_division = 0 outputs_list = [torch.tensor([0, 1, 2]), torch.tensor([2, 3]), torch.tensor([0, 1, 3])] targets_list = [torch.tensor([0, 1, 1]), torch.tensor([2, 3]), torch.tensor([0, 1, 2])] metric = metrics.MulticlassPrecisionRecallF1SupportMetric( num_classes=num_classes, zero_division=zero_division ) metric.reset() for outputs, targets in zip(outputs_list, targets_list): metric.update(outputs=outputs, targets=targets) metric.compute() # ( # # per class precision, recall, f1, support # ( # array([1. , 1. , 0.5, 0.5]), # array([1. , 0.66666667, 0.5 , 1. ]), # array([0.999995 , 0.7999952 , 0.499995 , 0.66666222]), # array([2., 3., 2., 1.]), # ), # # micro precision, recall, f1, support # (0.75, 0.75, 0.7499950000333331, None), # # macro precision, recall, f1, support # (0.75, 0.7916666666666667, 0.7416618555889127, None), # # weighted precision, recall, f1, support # (0.8125, 0.75, 0.7583284778110313, None) # ) metric.compute_key_value() # { # 'f1/_macro': 0.7416618555889127, # 'f1/_micro': 0.7499950000333331, # 'f1/_weighted': 0.7583284778110313, # 'f1/class_00': 0.9999950000249999, # 'f1/class_01': 0.7999952000287999, # 'f1/class_02': 0.49999500004999947, # 'f1/class_03': 0.6666622222518517, # 'precision/_macro': 0.75, # 'precision/_micro': 0.75, # 'precision/_weighted': 0.8125, # 'precision/class_00': 1.0, # 'precision/class_01': 1.0, # 'precision/class_02': 0.5, # 'precision/class_03': 0.5, # 'recall/_macro': 0.7916666666666667, # 'recall/_micro': 0.75, # 'recall/_weighted': 0.75, # 'recall/class_00': 1.0, # 'recall/class_01': 0.6666666666666667, # 'recall/class_02': 0.5, # 'recall/class_03': 1.0, # 'support/class_00': 2.0, # 'support/class_01': 3.0, # 'support/class_02': 2.0, # 'support/class_03': 1.0 # } metric.reset() metric(outputs_list[0], targets_list[0]) # ( # # per class precision, recall, f1, support # ( # array([1., 1., 0., 0.]), # array([1. , 0.5, 0. , 0. ]), # array([0.999995 , 0.66666222, 0. , 0. ]), # array([1., 2., 0., 0.]), # ), # # micro precision, recall, f1, support # (0.6666666666666667, 0.6666666666666667, 0.6666616667041664, None), # # macro precision, recall, f1, support # (0.5, 0.375, 0.41666430556921286, None), # # weighted precision, recall, f1, support # (1.0, 0.6666666666666666, 0.7777731481762343, None) # )
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_samples, num_features, num_classes = int(1e4), int(1e1), 4 X = torch.rand(num_samples, num_features) y = (torch.rand(num_samples,) * num_classes).to(torch.int64) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_classes) criterion = torch.nn.CrossEntropyLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, logdir="./logdir", num_epochs=3, valid_loader="valid", valid_metric="accuracy03", minimize_valid_metric=False, verbose=True, callbacks=[ dl.AccuracyCallback( input_key="logits", target_key="targets", num_classes=num_classes ), dl.PrecisionRecallF1SupportCallback( input_key="logits", target_key="targets", num_classes=num_classes ), dl.AUCCallback(input_key="logits", target_key="targets"), ], )
Note
Please follow the minimal examples sections for more use cases.
Classification – MultilabelPrecisionRecallF1SupportMetric¶
- class catalyst.metrics._classification.MultilabelPrecisionRecallF1SupportMetric(num_classes: int = None, zero_division: int = 0, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._classification.PrecisionRecallF1SupportMetric
Precision, recall, f1_score and support metrics for multilabel classification. Computes metrics with macro, micro, and weighted averaging.
- Parameters
num_classes – number of classes in loader’s dataset
zero_division – value to set in case of zero division during metrics (precision, recall) computation; should be one of 0 or 1
compute_on_call – if True, allows compute metric’s value on call
prefix – metric prefix
suffix – metric suffix
Examples:
import torch from catalyst import metrics num_classes = 4 zero_division = 0 outputs_list = [ torch.tensor([[0, 1, 0, 1], [0, 0, 0, 0], [0, 1, 1, 0]]), torch.tensor([[0, 1, 1, 1], [0, 0, 0, 1], [0, 1, 0, 1]]), torch.tensor([[0, 1, 0, 0], [0, 1, 0, 1]]), ] targets_list = [ torch.tensor([[0, 1, 1, 1], [0, 0, 0, 0], [0, 1, 0, 1]]), torch.tensor([[0, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]]), torch.tensor([[0, 1, 0, 0], [0, 0, 1, 0]]), ] metric = metrics.MultilabelPrecisionRecallF1SupportMetric( num_classes=num_classes, zero_division=zero_division ) metric.reset() for outputs, targets in zip(outputs_list, targets_list): metric.update(outputs=outputs, targets=targets) metric.compute() # ( # # per class precision, recall, f1, support # ( # array([0. , 0.66666667, 0. , 0.4 ]), # array([0. , 1. , 0. , 0.66666667]), # array([0. , 0.7999952 , 0. , 0.49999531]), # array([1., 4., 4., 3.]) # ), # # micro precision, recall, f1, support # (0.46153846153846156, 0.5, 0.4799950080519163, None), # # macro precision, recall, f1, support # (0.2666666666666667, 0.4166666666666667, 0.32499762814318617, None), # # weighted precision, recall, f1, support # (0.32222222222222224, 0.5, 0.39166389481225283, None) # ) metric.compute_key_value() # { # 'f1/_macro': 0.32499762814318617, # 'f1/_micro': 0.4799950080519163, # 'f1/_weighted': 0.39166389481225283, # 'f1/class_00': 0.0, # 'f1/class_01': 0.7999952000287999, # 'f1/class_02': 0.0, # 'f1/class_03': 0.49999531254394486, # 'precision/_macro': 0.2666666666666667, # 'precision/_micro': 0.46153846153846156, # 'precision/_weighted': 0.32222222222222224, # 'precision/class_00': 0.0, # 'precision/class_01': 0.6666666666666667, # 'precision/class_02': 0.0, # 'precision/class_03': 0.4, # 'recall/_macro': 0.4166666666666667, # 'recall/_micro': 0.5, # 'recall/_weighted': 0.5, # 'recall/class_00': 0.0, # 'recall/class_01': 1.0, # 'recall/class_02': 0.0, # 'recall/class_03': 0.6666666666666667, # 'support/class_00': 1.0, # 'support/class_01': 4.0, # 'support/class_02': 4.0, # 'support/class_03': 3.0 # } metric.reset() metric(outputs_list[0], targets_list[0]) # ( # # per class precision, recall, f1, support # ( # array([0., 1., 0., 1.]), # array([0. , 1. , 0. , 0.5]), # array([0. , 0.999995 , 0. , 0.66666222]), # array([0., 2., 1., 2.]) # ), # # micro precision, recall, f1, support # (0.75, 0.6, 0.6666617284316411, None), # # macro precision, recall, f1, support # (0.5, 0.375, 0.41666430556921286, None), # # weighted precision, recall, f1, support # (0.8, 0.6000000000000001, 0.6666628889107407, None) # )
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_samples, num_features, num_classes = int(1e4), int(1e1), 4 X = torch.rand(num_samples, num_features) y = (torch.rand(num_samples, num_classes) > 0.5).to(torch.float32) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_classes) criterion = torch.nn.BCEWithLogitsLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, logdir="./logdir", num_epochs=3, valid_loader="valid", valid_metric="accuracy", minimize_valid_metric=False, verbose=True, callbacks=[ dl.BatchTransformCallback( transform=torch.sigmoid, scope="on_batch_end", input_key="logits", output_key="scores" ), dl.AUCCallback(input_key="scores", target_key="targets"), dl.MultilabelAccuracyCallback( input_key="scores", target_key="targets", threshold=0.5 ), dl.MultilabelPrecisionRecallF1SupportCallback( input_key="scores", target_key="targets", threshold=0.5 ), ] )
Note
Please follow the minimal examples sections for more use cases.
CMCMetric¶
- class catalyst.metrics._cmc_score.CMCMetric(embeddings_key: str, labels_key: str, is_query_key: str, topk_args: Iterable[int] = None, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._accumulative.AccumulativeMetric
Cumulative Matching Characteristics
- Parameters
embeddings_key – key of embedding tensor in batch
labels_key – key of label tensor in batch
is_query_key – key of query flag tensor in batch
topk_args – list of k, specifies which cmc@k should be calculated
compute_on_call – if True, allows compute metric’s value on call
prefix – metric prefix
suffix – metric suffix
Examples:
import torch from catalyst import metrics batch = { "embeddings": torch.tensor( [ [1, 1, 0, 0], [1, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1], [1, 1, 1, 0], [1, 1, 1, 1], [0, 1, 1, 0], ] ).float(), "labels": torch.tensor([0, 0, 1, 1, 0, 1, 1]), "is_query": torch.tensor([1, 1, 1, 1, 0, 0, 0]).bool(), } topk = (1, 3) metric = metrics.CMCMetric( embeddings_key="embeddings", labels_key="labels", is_query_key="is_query", topk_args=topk, ) metric.reset(num_batches=1, num_samples=len(batch["embeddings"])) metric.update(**batch) metric.compute() # [0.75, 1.0] # CMC@01, CMC@03 metric.compute_key_value() # {'cmc01': 0.75, 'cmc03': 1.0}
import os from torch.optim import Adam from torch.utils.data import DataLoader from catalyst import data, dl from catalyst.contrib import datasets, models, nn from catalyst.data.transforms import Compose, Normalize, ToTensor # 1. train and valid loaders transforms = Compose([ToTensor(), Normalize((0.1307,), (0.3081,))]) train_dataset = datasets.MnistMLDataset( root=os.getcwd(), download=True, transform=transforms ) sampler = data.BalanceBatchSampler(labels=train_dataset.get_labels(), p=5, k=10) train_loader = DataLoader( dataset=train_dataset, sampler=sampler, batch_size=sampler.batch_size ) valid_dataset = datasets.MnistQGDataset( root=os.getcwd(), transform=transforms, gallery_fraq=0.2 ) valid_loader = DataLoader(dataset=valid_dataset, batch_size=1024) # 2. model and optimizer model = models.MnistSimpleNet(out_features=16) optimizer = Adam(model.parameters(), lr=0.001) # 3. criterion with triplets sampling sampler_inbatch = data.HardTripletsSampler(norm_required=False) criterion = nn.TripletMarginLossWithSampler(margin=0.5, sampler_inbatch=sampler_inbatch) # 4. training with catalyst Runner class CustomRunner(dl.SupervisedRunner): def handle_batch(self, batch) -> None: if self.is_train_loader: images, targets = batch["features"].float(), batch["targets"].long() features = self.model(images) self.batch = {"embeddings": features, "targets": targets,} else: images, targets, is_query = ( batch["features"].float(), batch["targets"].long(), batch["is_query"].bool() ) features = self.model(images) self.batch = { "embeddings": features, "targets": targets, "is_query": is_query } callbacks = [ dl.ControlFlowCallback( dl.CriterionCallback( input_key="embeddings", target_key="targets", metric_key="loss" ), loaders="train", ), dl.ControlFlowCallback( dl.CMCScoreCallback( embeddings_key="embeddings", labels_key="targets", is_query_key="is_query", topk_args=[1], ), loaders="valid", ), dl.PeriodicLoaderCallback( valid_loader_key="valid", valid_metric_key="cmc01", minimize=False, valid=2 ), ] runner = CustomRunner(input_key="features", output_key="embeddings") runner.train( model=model, criterion=criterion, optimizer=optimizer, callbacks=callbacks, loaders={"train": train_loader, "valid": valid_loader}, verbose=False, logdir="./logs", valid_loader="valid", valid_metric="cmc01", minimize_valid_metric=False, num_epochs=10, )
Note
Please follow the minimal examples sections for more use cases.
ReidCMCMetric¶
- class catalyst.metrics._cmc_score.ReidCMCMetric(embeddings_key: str, pids_key: str, cids_key: str, is_query_key: str, topk_args: Iterable[int] = None, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._accumulative.AccumulativeMetric
Cumulative Matching Characteristics for the re-identification (ReID) case.
- Parameters
embeddings_key – key of embedding tensor in batch
pids_key – key of pids tensor in batch
cids_key – key of cids tensor in batch
is_query_key – key of query flag tensor in batch
topk_args – list of k, specifies which cmc@k should be calculated
compute_on_call – if True, allows compute metric’s value on call
prefix – metric prefix
suffix – metric suffix
Examples:
import torch from catalyst.metrics import ReidCMCMetric batch = { "embeddings": torch.tensor( [ [1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 1], [0, 0, 1, 1], [1, 1, 1, 0], [1, 1, 1, 1], [0, 1, 1, 0], ] ).float(), "pids": torch.Tensor([0, 0, 1, 1, 0, 1, 1]).long(), "cids": torch.Tensor([0, 1, 1, 2, 0, 1, 3]).long(), "is_query": torch.Tensor([1, 1, 1, 1, 0, 0, 0]).bool(), } topk = (1, 3) metric = ReidCMCMetric( embeddings_key="embeddings", pids_key="pids", cids_key="cids", is_query_key="is_query", topk_args=topk, ) metric.reset(num_batches=1, num_samples=len(batch["embeddings"])) metric.update(**batch) metric.compute() # [0.75, 1.0] # CMC@01, CMC@03 metric.compute_key_value() # {'cmc01': 0.75, 'cmc03': 1.0}
RecSys – HitrateMetric¶
- class catalyst.metrics._hitrate.HitrateMetric(topk_args: List[int] = None, compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackBatchMetric
Calculates the hitrate.
- Parameters
topk_args – list of topk for hitrate@topk computing
compute_on_call – if True, computes and returns metric value during metric call
prefix – metric prefix
suffix – metric suffix
Computes the mean value of hitrate and its approximate std value.
Examples:
import torch from catalyst import metrics outputs = torch.Tensor([[4.0, 2.0, 3.0, 1.0], [1.0, 2.0, 3.0, 4.0]]) targets = torch.Tensor([[0, 0, 1.0, 1.0], [0, 0, 0.0, 0.0]]) metric = metrics.HitrateMetric(topk_args=[1, 2, 3, 4]) metric.reset() metric.update(outputs, targets) metric.compute() # ( # (0.0, 0.25, 0.25, 0.5), # mean for @01, @02, @03, @04 # (0.0, 0.0, 0.0, 0.0) # std for @01, @02, @03, @04 # ) metric.compute_key_value() # { # 'hitrate': 0.0, # 'hitrate/std': 0.0, # 'hitrate01': 0.0, # 'hitrate01/std': 0.0, # 'hitrate02': 0.25, # 'hitrate02/std': 0.0, # 'hitrate03': 0.25, # 'hitrate03/std': 0.0, # 'hitrate04': 0.5, # 'hitrate04/std': 0.0 # } metric.reset() metric(outputs, targets) # ( # (0.0, 0.25, 0.25, 0.5), # mean for @01, @02, @03, @04 # (0.0, 0.0, 0.0, 0.0) # std for @01, @02, @03, @04 # )
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_users, num_features, num_items = int(1e4), int(1e1), 10 X = torch.rand(num_users, num_features) y = (torch.rand(num_users, num_items) > 0.5).to(torch.float32) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_items) criterion = torch.nn.BCEWithLogitsLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, num_epochs=3, verbose=True, callbacks=[ dl.BatchTransformCallback( transform=torch.sigmoid, scope="on_batch_end", input_key="logits", output_key="scores" ), dl.CriterionCallback( input_key="logits", target_key="targets", metric_key="loss" ), dl.AUCCallback(input_key="scores", target_key="targets"), dl.HitrateCallback( input_key="scores", target_key="targets", topk_args=(1, 3, 5) ), dl.MRRCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.MAPCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.NDCGCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.OptimizerCallback(metric_key="loss"), dl.SchedulerCallback(), dl.CheckpointCallback( logdir="./logs", loader_key="valid", metric_key="loss", minimize=True ), ] )
Note
Please follow the minimal examples sections for more use cases.
RecSys – MAPMetric¶
- class catalyst.metrics._map.MAPMetric(topk_args: List[int] = None, compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackBatchMetric
Calculates the Mean Average Precision (MAP) for RecSys. The precision metric summarizes the fraction of relevant items in the whole recommendation list. Computes the mean value of MAP and its approximate std value.
- Parameters
topk_args – list of topk for map@topk computing
compute_on_call – if True, computes and returns metric value during metric call
prefix – metric prefix
suffix – metric suffix
Examples:
import torch from catalyst import metrics outputs = torch.tensor([ [9, 8, 7, 6, 5, 4, 3, 2, 1, 0], [9, 8, 7, 6, 5, 4, 3, 2, 1, 0], ]) targets = torch.tensor([ [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0], [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0], ]) metric = metrics.MAPMetric(topk_args=[1, 3, 5, 10]) metric.reset() metric.update(outputs, targets) metric.compute() # ( # # mean for @01, @03, @05, @10 # (0.5, 0.6666666865348816, 0.6416666507720947, 0.5325397253036499), # # std for @01, @03, @05, @10 # (0.0, 0.0, 0.0, 0.0) # ) metric.compute_key_value() # { # 'map': 0.5, # 'map/std': 0.0, # 'map01': 0.5, # 'map01/std': 0.0, # 'map03': 0.6666666865348816, # 'map03/std': 0.0, # 'map05': 0.6416666507720947, # 'map05/std': 0.0, # 'map10': 0.5325397253036499, # 'map10/std': 0.0 # } metric.reset() metric(outputs, targets) # ( # # mean for @01, @03, @05, @10 # (0.5, 0.6666666865348816, 0.6416666507720947, 0.5325397253036499), # # std for @01, @03, @05, @10 # (0.0, 0.0, 0.0, 0.0) # )
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_users, num_features, num_items = int(1e4), int(1e1), 10 X = torch.rand(num_users, num_features) y = (torch.rand(num_users, num_items) > 0.5).to(torch.float32) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_items) criterion = torch.nn.BCEWithLogitsLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, num_epochs=3, verbose=True, callbacks=[ dl.BatchTransformCallback( transform=torch.sigmoid, scope="on_batch_end", input_key="logits", output_key="scores" ), dl.CriterionCallback( input_key="logits", target_key="targets", metric_key="loss" ), dl.AUCCallback(input_key="scores", target_key="targets"), dl.HitrateCallback( input_key="scores", target_key="targets", topk_args=(1, 3, 5) ), dl.MRRCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.MAPCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.NDCGCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.OptimizerCallback(metric_key="loss"), dl.SchedulerCallback(), dl.CheckpointCallback( logdir="./logs", loader_key="valid", metric_key="loss", minimize=True ), ] )
Note
Please follow the minimal examples sections for more use cases.
RecSys – MRRMetric¶
- class catalyst.metrics._mrr.MRRMetric(topk_args: List[int] = None, compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackBatchMetric
Calculates the Mean Reciprocal Rank (MRR) score given model outputs and targets. Computes the mean value of MRR and its approximate std value.
- Parameters
topk_args – list of topk for mrr@topk computing
compute_on_call – if True, computes and returns metric value during metric call
prefix – metric prefix
suffix – metric suffix
Examples:
import torch
from catalyst import metrics

outputs = torch.Tensor([
    [4.0, 2.0, 3.0, 1.0],
    [1.0, 2.0, 3.0, 4.0],
])
targets = torch.tensor([
    [0, 0, 1.0, 1.0],
    [0, 0, 1.0, 1.0],
])

metric = metrics.MRRMetric(topk_args=[1, 3])
metric.reset()
metric.update(outputs, targets)
metric.compute()
# ((0.5, 0.75), (0.0, 0.0))  # mean, std for @01, @03
metric.compute_key_value()
# {
#     'mrr01': 0.5,
#     'mrr03': 0.75,
#     'mrr': 0.5,
#     'mrr01/std': 0.0,
#     'mrr03/std': 0.0,
#     'mrr/std': 0.0
# }
metric.reset()
metric(outputs, targets)
# ((0.5, 0.75), (0.0, 0.0))  # mean, std for @01, @03
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_users, num_features, num_items = int(1e4), int(1e1), 10 X = torch.rand(num_users, num_features) y = (torch.rand(num_users, num_items) > 0.5).to(torch.float32) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_items) criterion = torch.nn.BCEWithLogitsLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, num_epochs=3, verbose=True, callbacks=[ dl.BatchTransformCallback( transform=torch.sigmoid, scope="on_batch_end", input_key="logits", output_key="scores" ), dl.CriterionCallback( input_key="logits", target_key="targets", metric_key="loss" ), dl.AUCCallback(input_key="scores", target_key="targets"), dl.HitrateCallback( input_key="scores", target_key="targets", topk_args=(1, 3, 5) ), dl.MRRCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.MAPCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.NDCGCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.OptimizerCallback(metric_key="loss"), dl.SchedulerCallback(), dl.CheckpointCallback( logdir="./logs", loader_key="valid", metric_key="loss", minimize=True ), ] )
Note
Please follow the minimal examples sections for more use cases.
RecSys – NDCGMetric¶
- class catalyst.metrics._ndcg.NDCGMetric(topk_args: List[int] = None, compute_on_call: bool = True, prefix: str = None, suffix: str = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackBatchMetric
Calculates the Normalized Discounted Cumulative Gain (NDCG) score given model outputs and targets. Computes the mean value of NDCG and its approximate std value.
- Parameters
topk_args – list of topk for ndcg@topk computing
compute_on_call – if True, computes and returns metric value during metric call
prefix – metric prefix
suffix – metric suffix
Examples:
import torch
from catalyst import metrics

outputs = torch.Tensor([
    [0.5, 0.2, 0.1],
    [0.5, 0.2, 0.1],
])
targets = torch.tensor([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
])

metric = metrics.NDCGMetric(topk_args=[1, 2])
metric.reset()
metric.update(outputs, targets)
metric.compute()
# (
#     (1.0, 0.6131471991539001),  # mean for @01, @02
#     (0.0, 0.0)                  # std for @01, @02
# )
metric.compute_key_value()
# {
#     'ndcg01': 1.0,
#     'ndcg02': 0.6131471991539001,
#     'ndcg': 1.0,
#     'ndcg01/std': 0.0,
#     'ndcg02/std': 0.0,
#     'ndcg/std': 0.0
# }
metric.reset()
metric(outputs, targets)
# (
#     (1.0, 0.6131471991539001),  # mean for @01, @02
#     (0.0, 0.0)                  # std for @01, @02
# )
import torch from torch.utils.data import DataLoader, TensorDataset from catalyst import dl # sample data num_users, num_features, num_items = int(1e4), int(1e1), 10 X = torch.rand(num_users, num_features) y = (torch.rand(num_users, num_items) > 0.5).to(torch.float32) # pytorch loaders dataset = TensorDataset(X, y) loader = DataLoader(dataset, batch_size=32, num_workers=1) loaders = {"train": loader, "valid": loader} # model, criterion, optimizer, scheduler model = torch.nn.Linear(num_features, num_items) criterion = torch.nn.BCEWithLogitsLoss() optimizer = torch.optim.Adam(model.parameters()) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2]) # model training runner = dl.SupervisedRunner( input_key="features", output_key="logits", target_key="targets", loss_key="loss" ) runner.train( model=model, criterion=criterion, optimizer=optimizer, scheduler=scheduler, loaders=loaders, num_epochs=3, verbose=True, callbacks=[ dl.BatchTransformCallback( transform=torch.sigmoid, scope="on_batch_end", input_key="logits", output_key="scores" ), dl.CriterionCallback( input_key="logits", target_key="targets", metric_key="loss" ), dl.AUCCallback(input_key="scores", target_key="targets"), dl.HitrateCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.HitrateCallback( input_key="scores", target_key="targets", topk_args=(1, 3, 5) ), dl.MAPCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.NDCGCallback(input_key="scores", target_key="targets", topk_args=(1, 3, 5)), dl.OptimizerCallback(metric_key="loss"), dl.SchedulerCallback(), dl.CheckpointCallback( logdir="./logs", loader_key="valid", metric_key="loss", minimize=True ), ] )
Note
Please follow the minimal examples sections for more use cases.
Segmentation – RegionBasedMetric¶
- class catalyst.metrics._segmentation.RegionBasedMetric(metric_fn: Callable, metric_name: str, class_dim: int = 1, weights: Optional[List[float]] = None, class_names: Optional[List[str]] = None, threshold: Optional[float] = 0.5, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._metric.ICallbackBatchMetric
Base class for all region-based metrics, such as IoU, Dice, and Trevsky.
- Parameters
metric_fn – metric function that takes statistics and returns the score
metric_name – name of the metric
class_dim – indicates class dimension (K) for outputs and targets tensors (default = 1)
weights – class weights
class_names – class names
threshold – threshold for outputs binarization
compute_on_call – Computes and returns metric value during metric call. Used for per-batch logging. default: True
prefix – metric prefix
suffix – metric suffix
This is an interface; please check out the implementations below for more details:
Segmentation – DiceMetric¶
- class catalyst.metrics._segmentation.DiceMetric(class_dim: int = 1, weights: Optional[List[float]] = None, class_names: Optional[List[str]] = None, threshold: Optional[float] = None, eps: float = 1e-07, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._segmentation.RegionBasedMetric
Dice Metric: dice score = 2 * intersection / (intersection + union) = 2 * tp / (2 * tp + fp + fn)
- Parameters
class_dim – indicates class dimension (K) for outputs and targets tensors (default = 1)
weights – class weights
class_names – class names
threshold – threshold for outputs binarization
eps – epsilon to avoid zero division
compute_on_call – Computes and returns metric value during metric call. Used for per-batch logging. default: True
prefix – metric prefix
suffix – metric suffix
Examples:
import torch
from catalyst import metrics

outputs = torch.tensor([[[[0.8, 0.1, 0], [0, 0.4, 0.3], [0, 0, 1]]]])
targets = torch.tensor([[[[1.0, 0, 0], [0, 1, 0], [1, 1, 0]]]])

metric = metrics.DiceMetric()
metric.reset()
metric.update_key_value(outputs, targets)
metric.compute()
# per_class, micro, macro, weighted
# ([tensor(0.3636)], tensor(0.3636), tensor(0.3636), None)
metric.compute_key_value()
# {
#     'dice': tensor(0.3636),
#     'dice/_macro': tensor(0.3636),
#     'dice/_micro': tensor(0.3636),
#     'dice/class_00': tensor(0.3636),
# }
import os import torch from torch import nn from torch.utils.data import DataLoader from catalyst import dl from catalyst.data import ToTensor from catalyst.contrib.datasets import MNIST from catalyst.contrib.nn import IoULoss model = nn.Sequential( nn.Conv2d(1, 1, 3, 1, 1), nn.ReLU(), nn.Conv2d(1, 1, 3, 1, 1), nn.Sigmoid(), ) criterion = IoULoss() optimizer = torch.optim.Adam(model.parameters(), lr=0.02) loaders = { "train": DataLoader( MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32 ), "valid": DataLoader( MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32 ), } class CustomRunner(dl.SupervisedRunner): def handle_batch(self, batch): x = batch[self._input_key] x_noise = (x + torch.rand_like(x)).clamp_(0, 1) x_ = self.model(x_noise) self.batch = {self._input_key: x, self._output_key: x_, self._target_key: x} runner = CustomRunner( input_key="features", output_key="scores", target_key="targets", loss_key="loss" ) # model training runner.train( model=model, criterion=criterion, optimizer=optimizer, loaders=loaders, num_epochs=1, callbacks=[ dl.IOUCallback(input_key="scores", target_key="targets"), dl.DiceCallback(input_key="scores", target_key="targets"), dl.TrevskyCallback(input_key="scores", target_key="targets", alpha=0.2), ], logdir="./logdir", valid_loader="valid", valid_metric="loss", minimize_valid_metric=True, verbose=True, )
Note
Please follow the minimal examples sections for more use cases.
Segmentation – IOUMetric¶
- class catalyst.metrics._segmentation.IOUMetric(class_dim: int = 1, weights: Optional[List[float]] = None, class_names: Optional[List[str]] = None, threshold: Optional[float] = None, eps: float = 1e-07, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._segmentation.RegionBasedMetric
IoU Metric, iou score = intersection / union = tp / (tp + fp + fn).
- Parameters
class_dim – indicates class dimension (K) for outputs and targets tensors (default = 1)
weights – class weights
class_names – class names
threshold – threshold for outputs binarization
eps – epsilon to avoid zero division
compute_on_call – Computes and returns metric value during metric call. Used for per-batch logging. default: True
prefix – metric prefix
suffix – metric suffix
Examples:
import torch
from catalyst import metrics

outputs = torch.tensor([[[[0.8, 0.1, 0], [0, 0.4, 0.3], [0, 0, 1]]]])
targets = torch.tensor([[[[1.0, 0, 0], [0, 1, 0], [1, 1, 0]]]])

metric = metrics.IOUMetric()
metric.reset()
metric.update_key_value(outputs, targets)
metric.compute()
# per_class, micro, macro, weighted
# ([tensor(0.2222)], tensor(0.2222), tensor(0.2222), None)
metric.compute_key_value()
# {
#     'iou': tensor(0.2222),
#     'iou/_macro': tensor(0.2222),
#     'iou/_micro': tensor(0.2222),
#     'iou/class_00': tensor(0.2222),
# }
import os import torch from torch import nn from torch.utils.data import DataLoader from catalyst import dl from catalyst.data import ToTensor from catalyst.contrib.datasets import MNIST from catalyst.contrib.nn import IoULoss model = nn.Sequential( nn.Conv2d(1, 1, 3, 1, 1), nn.ReLU(), nn.Conv2d(1, 1, 3, 1, 1), nn.Sigmoid(), ) criterion = IoULoss() optimizer = torch.optim.Adam(model.parameters(), lr=0.02) loaders = { "train": DataLoader( MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32 ), "valid": DataLoader( MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32 ), } class CustomRunner(dl.SupervisedRunner): def handle_batch(self, batch): x = batch[self._input_key] x_noise = (x + torch.rand_like(x)).clamp_(0, 1) x_ = self.model(x_noise) self.batch = {self._input_key: x, self._output_key: x_, self._target_key: x} runner = CustomRunner( input_key="features", output_key="scores", target_key="targets", loss_key="loss" ) # model training runner.train( model=model, criterion=criterion, optimizer=optimizer, loaders=loaders, num_epochs=1, callbacks=[ dl.IOUCallback(input_key="scores", target_key="targets"), dl.DiceCallback(input_key="scores", target_key="targets"), dl.TrevskyCallback(input_key="scores", target_key="targets", alpha=0.2), ], logdir="./logdir", valid_loader="valid", valid_metric="loss", minimize_valid_metric=True, verbose=True, )
Note
Please follow the minimal examples sections for more use cases.
Segmentation – TrevskyMetric¶
- class catalyst.metrics._segmentation.TrevskyMetric(alpha: float, beta: Optional[float] = None, class_dim: int = 1, weights: Optional[List[float]] = None, class_names: Optional[List[str]] = None, threshold: Optional[float] = None, eps: float = 1e-07, compute_on_call: bool = True, prefix: Optional[str] = None, suffix: Optional[str] = None)[source]¶
Bases: catalyst.metrics._segmentation.RegionBasedMetric
Trevsky Metric: trevsky score = tp / (tp + fp * beta + fn * alpha)
- Parameters
alpha – false negative coefficient; the bigger alpha, the bigger the penalty for false negatives. If beta is None, alpha must be in (0, 1)
beta – false positive coefficient; the bigger beta, the bigger the penalty for false positives. Must be in (0, 1); if None, beta = 1 - alpha
class_dim – indicates class dimension (K) for outputs and targets tensors (default = 1)
weights – class weights
class_names – class names
threshold – threshold for outputs binarization
eps – epsilon to avoid zero division
compute_on_call – Computes and returns metric value during metric call. Used for per-batch logging. default: True
prefix – metric prefix
suffix – metric suffix
Examples:
import torch
from catalyst import metrics

outputs = torch.tensor([[[[0.8, 0.1, 0], [0, 0.4, 0.3], [0, 0, 1]]]])
targets = torch.tensor([[[[1.0, 0, 0], [0, 1, 0], [1, 1, 0]]]])

metric = metrics.TrevskyMetric(alpha=0.2)
metric.reset()
metric.update_key_value(outputs, targets)
metric.compute()
# per_class, micro, macro, weighted
# ([tensor(0.4167)], tensor(0.4167), tensor(0.4167), None)
metric.compute_key_value()
# {
#     'trevsky': tensor(0.4167),
#     'trevsky/_macro': tensor(0.4167),
#     'trevsky/_micro': tensor(0.4167),
#     'trevsky/class_00': tensor(0.4167),
# }
import os import torch from torch import nn from torch.utils.data import DataLoader from catalyst import dl from catalyst.data import ToTensor from catalyst.contrib.datasets import MNIST from catalyst.contrib.nn import IoULoss model = nn.Sequential( nn.Conv2d(1, 1, 3, 1, 1), nn.ReLU(), nn.Conv2d(1, 1, 3, 1, 1), nn.Sigmoid(), ) criterion = IoULoss() optimizer = torch.optim.Adam(model.parameters(), lr=0.02) loaders = { "train": DataLoader( MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32 ), "valid": DataLoader( MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32 ), } class CustomRunner(dl.SupervisedRunner): def handle_batch(self, batch): x = batch[self._input_key] x_noise = (x + torch.rand_like(x)).clamp_(0, 1) x_ = self.model(x_noise) self.batch = {self._input_key: x, self._output_key: x_, self._target_key: x} runner = CustomRunner( input_key="features", output_key="scores", target_key="targets", loss_key="loss" ) # model training runner.train( model=model, criterion=criterion, optimizer=optimizer, loaders=loaders, num_epochs=1, callbacks=[ dl.IOUCallback(input_key="scores", target_key="targets"), dl.DiceCallback(input_key="scores", target_key="targets"), dl.TrevskyCallback(input_key="scores", target_key="targets", alpha=0.2), ], logdir="./logdir", valid_loader="valid", valid_metric="loss", minimize_valid_metric=True, verbose=True, )
Note
Please follow the minimal examples sections for more use cases.
Functional API¶
Accuracy¶
- catalyst.metrics.functional._accuracy.accuracy(outputs: torch.Tensor, targets: torch.Tensor, topk: Sequence[int] = (1, )) → Sequence[torch.Tensor][source]¶
Computes multiclass accuracy@topk for the specified values of topk.
- Parameters
outputs – model outputs, logits with shape [bs; num_classes]
targets – ground truth, labels with shape [bs; 1]
topk – topk for accuracy@topk computing
- Returns
list with computed accuracy@topk
Examples:
import torch
from catalyst import metrics

metrics.accuracy(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ]),
    targets=torch.tensor([0, 1, 2]),
)
# [tensor([1.])]
import torch
from catalyst import metrics

metrics.accuracy(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 1, 0],
    ]),
    targets=torch.tensor([0, 1, 2]),
)
# [tensor([0.6667])]
import torch
from catalyst import metrics

metrics.accuracy(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ]),
    targets=torch.tensor([0, 1, 2]),
    topk=[1, 3],
)
# [tensor([1.]), tensor([1.])]
import torch
from catalyst import metrics

metrics.accuracy(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 1, 0],
    ]),
    targets=torch.tensor([0, 1, 2]),
    topk=[1, 3],
)
# [tensor([0.6667]), tensor([1.])]
- catalyst.metrics.functional._accuracy.multilabel_accuracy(outputs: torch.Tensor, targets: torch.Tensor, threshold: Union[float, torch.Tensor]) → torch.Tensor[source]¶
Computes multilabel accuracy for the specified activation and threshold.
- Parameters
outputs – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.
targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (e.g. a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4)
threshold – threshold for the model output
- Returns
computed multilabel accuracy
Examples:
import torch
from catalyst import metrics

metrics.multilabel_accuracy(
    outputs=torch.tensor([
        [1, 0],
        [0, 1],
    ]),
    targets=torch.tensor([
        [1, 0],
        [0, 1],
    ]),
    threshold=0.5,
)
# tensor([1.])

import torch
from catalyst import metrics

metrics.multilabel_accuracy(
    outputs=torch.tensor([
        [1.0, 0.0],
        [0.6, 1.0],
    ]),
    targets=torch.tensor([
        [1, 0],
        [0, 1],
    ]),
    threshold=0.5,
)
# tensor(0.7500)

import torch
from catalyst import metrics

metrics.multilabel_accuracy(
    outputs=torch.tensor([
        [1.0, 0.0],
        [0.4, 1.0],
    ]),
    targets=torch.tensor([
        [1, 0],
        [0, 1],
    ]),
    threshold=0.5,
)
# tensor(1.0)
AUC¶
-
catalyst.metrics.functional._auc.
binary_auc
(scores: torch.Tensor, targets: torch.Tensor) → Tuple[float, numpy.ndarray, numpy.ndarray][source]¶ Binary AUC computation.
- Parameters
scores – estimated scores from a model.
targets – ground truth (correct) target values.
- Returns
measured roc-auc, true positive rate, false positive rate
- Return type
Tuple[float, np.ndarray, np.ndarray]
Warning
This metric is under API improvement.
Example:
import torch
from catalyst import metrics

metrics.binary_auc(
    scores=torch.tensor([
        0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0,
    ]),
    targets=torch.tensor([
        0, 1, 1, 1, 1, 1, 1, 0, 0, 0,
    ]),
)
# 0.7500,
# [0.  , 0.  , 0.16, 0.33, 0.5 , 0.66, 0.83, 0.83, 1.  , 1.  , 1.  ],
# [0.  , 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.5 , 0.75, 1.  ]
-
catalyst.metrics.functional._auc.
auc
(scores: torch.Tensor, targets: torch.Tensor) → torch.Tensor[source]¶ Computes ROC-AUC.
- Parameters
scores – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.
targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (e.g. a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4)
- Returns
Tensor with [num_classes] shape of per-class-aucs
- Return type
torch.Tensor
Examples:
import torch
from catalyst import metrics

metrics.auc(
    scores=torch.tensor([
        [0.9, 0.1],
        [0.1, 0.9],
    ]),
    targets=torch.tensor([
        [1, 0],
        [0, 1],
    ]),
)
# tensor([1., 1.])

import torch
from catalyst import metrics

metrics.auc(
    scores=torch.tensor([
        [0.9], [0.8], [0.7], [0.6], [0.5],
        [0.4], [0.3], [0.2], [0.1], [0.0],
    ]),
    targets=torch.tensor([
        [0], [1], [1], [1], [1],
        [1], [1], [0], [0], [0],
    ]),
)
# tensor([0.7500])
Warning
This metric is under API improvement.
Average Precision¶
-
catalyst.metrics.functional._average_precision.
binary_average_precision
(outputs: torch.Tensor, targets: torch.Tensor, weights: Optional[torch.Tensor] = None) → torch.Tensor[source]¶ Computes the binary average precision.
- Parameters
outputs – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.
targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (e.g. a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4)
weights – importance for each sample
- Returns
tensor of [K; ] shape, with average precision for K classes
- Return type
torch.Tensor
Example:
import torch
from catalyst import metrics

metrics.binary_average_precision(
    outputs=torch.Tensor([0.1, 0.4, 0.35, 0.8]),
    targets=torch.Tensor([0, 0, 1, 1]),
)
# tensor([0.8333])
-
catalyst.metrics.functional._average_precision.
mean_average_precision
(outputs: torch.Tensor, targets: torch.Tensor, topk: List[int]) → List[torch.Tensor][source]¶ Calculate the mean average precision (MAP) for RecSys. The metric calculates the mean of the AP across all batches.
MAP amplifies the interest in finding many relevant items for each query.
- Parameters
outputs (torch.Tensor) – Tensor with predicted score size: [batch_size, slate_length] model outputs, logits
targets (torch.Tensor) – Binary tensor with ground truth. 1 means the item is relevant and 0 means it is not relevant. size: [batch_size, slate_length] ground truth, labels
topk (List[int]) – List of k values for evaluation on top-k items
- Returns
The MAP score for every k. size: len(topk)
- Return type
map_at_k (List[torch.Tensor])
Example:
import torch
from catalyst import metrics

metrics.mean_average_precision(
    outputs=torch.tensor([
        [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
        [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
    ]),
    targets=torch.tensor([
        [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0],
        [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    ]),
    topk=[1, 3, 5, 10],
)
# [tensor(0.5000), tensor(0.6667), tensor(0.6417), tensor(0.5325)]
-
catalyst.metrics.functional._average_precision.
average_precision
(outputs: torch.Tensor, targets: torch.Tensor, k: int) → torch.Tensor[source]¶ Calculate the Average Precision for RecSys. The precision metric summarizes the fraction of relevant items in the whole recommendation list.
To compute the precision at k, set the threshold rank to k and compute the percentage of relevant items among the top-k, ignoring the documents ranked lower than k.
The average precision at k (AP at k) summarizes the average precision for relevant items up to the k-th one. Wikipedia entry for Average precision:
<https://en.wikipedia.org/w/index.php?title=Information_retrieval&oldid=793358396#Average_precision>
If a relevant document never gets retrieved, we assume the precision corresponding to that relevant document to be zero.
- Parameters
outputs (torch.Tensor) – Tensor with predicted score size: [batch_size, slate_length] model outputs, logits
targets (torch.Tensor) – Binary tensor with ground truth. 1 means the item is relevant and 0 means it is not relevant. size: [batch_size, slate_length] ground truth, labels
k – Parameter for evaluation on top-k items
- Returns
The AP score for each batch. size: [batch_size, 1]
- Return type
ap_score (torch.Tensor)
Example:
import torch
from catalyst import metrics

metrics.average_precision(
    outputs=torch.tensor([
        [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
        [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
    ]),
    targets=torch.tensor([
        [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0],
        [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    ]),
    k=10,
)
# tensor([0.6222, 0.4429])
Classification¶
-
catalyst.metrics.functional._classification.
f1score
(precision_value, recall_value, eps=1e-05)[source]¶ Calculates the F1-score from precision and recall, avoiding redundant recomputation.
- Parameters
precision_value – precision (0-1)
recall_value – recall (0-1)
eps – epsilon to use
- Returns
F1 score (0-1)
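Example (an illustrative sketch, not from the library docs; the values are made up and the import uses the module path documented above):
from catalyst.metrics.functional._classification import f1score

# harmonic mean of precision=0.75 and recall=0.6 is ~0.6667 (up to the eps term)
score = f1score(precision_value=0.75, recall_value=0.6)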
-
catalyst.metrics.functional._classification.
precision_recall_fbeta_support
(outputs: torch.Tensor, targets: torch.Tensor, beta: float = 1, eps: float = 1e-06, argmax_dim: int = -1, num_classes: Optional[int] = None, zero_division: int = 0) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]¶ Counts precision_val, recall, fbeta_score.
- Parameters
outputs – A list of predicted elements
targets – A list of elements that are to be predicted
beta – beta param for f_score
eps – epsilon to avoid zero division
argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs
num_classes – int that specifies the number of classes, if it is known
zero_division – int value, should be one of 0 or 1; used for precision_val and recall computation
- Returns
tuple of precision_val, recall, fbeta_score
Examples:
import torch
from catalyst import metrics

metrics.precision_recall_fbeta_support(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ]),
    targets=torch.tensor([0, 1, 2]),
    beta=1,
)
# (
#     tensor([1., 1., 1.]),  # per class precision
#     tensor([1., 1., 1.]),  # per class recall
#     tensor([1., 1., 1.]),  # per class fbeta
#     tensor([1., 1., 1.]),  # per class support
# )

import torch
from catalyst import metrics

metrics.precision_recall_fbeta_support(
    outputs=torch.tensor([[0, 0, 1, 1, 0, 1, 0, 1]]),
    targets=torch.tensor([[0, 1, 0, 1, 0, 0, 1, 1]]),
    beta=1,
)
# (
#     tensor([0.5000, 0.5000]),  # per class precision
#     tensor([0.5000, 0.5000]),  # per class recall
#     tensor([0.5000, 0.5000]),  # per class fbeta
#     tensor([4., 4.]),  # per class support
# )
-
catalyst.metrics.functional._classification.
precision
(tp: int, fp: int, zero_division: int = 0) → float[source]¶ Calculates precision (a.k.a. positive predictive value) for binary classification and segmentation.
- Parameters
tp – number of true positives
fp – number of false positives
zero_division – int value, should be one of 0 or 1; if both tp==0 and fp==0, return this value as a result
- Returns
precision value (0-1)
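Example (an illustrative sketch, not from the library docs; the counts are hypothetical):
from catalyst.metrics.functional._classification import precision

# 3 true positives and 1 false positive -> precision = 3 / (3 + 1) = 0.75
precision(tp=3, fp=1)
# with no predicted positives at all, the zero_division value is returned
precision(tp=0, fp=0, zero_division=1)  # 1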
-
catalyst.metrics.functional._classification.
recall
(tp: int, fn: int, zero_division: int = 0) → float[source]¶ Calculates recall (a.k.a. true positive rate) for binary classification and segmentation.
- Parameters
tp – number of true positives
fn – number of false negatives
zero_division – int value, should be one of 0 or 1; if both tp==0 and fn==0, return this value as a result
- Returns
recall value (0-1)
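Example (an illustrative sketch, not from the library docs; the counts are hypothetical):
from catalyst.metrics.functional._classification import recall

# 3 true positives and 1 false negative -> recall = 3 / (3 + 1) = 0.75
recall(tp=3, fn=1)
# with no actual positives seen, the zero_division value is returned
recall(tp=0, fn=0, zero_division=0)  # 0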
-
catalyst.metrics.functional._classification.
get_aggregated_metrics
(tp: numpy.array, fp: numpy.array, fn: numpy.array, support: numpy.array, zero_division: int = 0) → Tuple[numpy.array, numpy.array, numpy.array, numpy.array][source]¶ Counts precision, recall and f1 scores per class, as well as their macro, weighted and micro averages, from the given statistics.
- Parameters
tp – array of shape (num_classes, ) of true positive statistics per class
fp – array of shape (num_classes, ) of false positive statistics per class
fn – array of shape (num_classes, ) of false negative statistics per class
support – array of shape (num_classes, ) of samples count per class
zero_division – int value, should be one of 0 or 1; used for precision and recall computation
- Returns
per-class, micro, macro, weighted averaging
- Return type
arrays of metrics
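Example (an illustrative sketch, not from the library docs; the per-class statistics are hypothetical and the returned values are not asserted):
import numpy as np
from catalyst.metrics.functional._classification import get_aggregated_metrics

# hypothetical statistics for a 3-class problem
tp = np.array([10.0, 5.0, 0.0])
fp = np.array([2.0, 3.0, 1.0])
fn = np.array([1.0, 4.0, 2.0])
support = np.array([11.0, 9.0, 2.0])

per_class, micro, macro, weighted = get_aggregated_metrics(
    tp=tp, fp=fp, fn=fn, support=support, zero_division=0
)
# per_class holds the per-class precision/recall/f1 metrics; micro, macro and
# weighted hold the corresponding averaged metrics (see Returns above)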
-
catalyst.metrics.functional._classification.
get_binary_metrics
(tp: int, fp: int, fn: int, zero_division: int) → Tuple[float, float, float][source]¶ Get precision, recall, f1 score metrics from true positive, false positive, false negative statistics for binary classification.
- Parameters
tp – true positive
fp – false positive
fn – false negative
zero_division – int value, should be 0 or 1
- Returns
precision, recall, f1 scores
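Example (an illustrative sketch, not from the library docs; the counts are hypothetical):
from catalyst.metrics.functional._classification import get_binary_metrics

# 8 TP, 2 FP, 2 FN -> precision = 0.8, recall = 0.8, f1 ≈ 0.8
precision_value, recall_value, f1_value = get_binary_metrics(
    tp=8, fp=2, fn=2, zero_division=0
)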
CMC Score¶
-
catalyst.metrics.functional._cmc_score.
cmc_score_count
(distances: torch.Tensor, conformity_matrix: torch.Tensor, topk: int = 1) → float[source]¶ Function to count CMC from distance matrix and conformity matrix.
- Parameters
distances – distance matrix shape of (n_embeddings_x, n_embeddings_y)
conformity_matrix – binary matrix with 1 on same label pos and 0 otherwise
topk – number of top examples for cumulative score counting
- Returns
cmc score
Examples:
import torch
from catalyst import metrics

metrics.cmc_score_count(
    distances=torch.tensor([[1, 2], [2, 1]]),
    conformity_matrix=torch.tensor([[0, 1], [1, 0]]),
    topk=1,
)
# 0.0

import torch
from catalyst import metrics

metrics.cmc_score_count(
    distances=torch.tensor([[1, 0.5, 0.2], [2, 3, 4], [0.4, 3, 4]]),
    conformity_matrix=torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]]),
    topk=2,
)
# 0.33
-
catalyst.metrics.functional._cmc_score.
cmc_score
(query_embeddings: torch.Tensor, gallery_embeddings: torch.Tensor, conformity_matrix: torch.Tensor, topk: int = 1) → float[source]¶ Function to count CMC score from query and gallery embeddings.
- Parameters
query_embeddings – tensor shape of (n_embeddings, embedding_dim) embeddings of the objects in query
gallery_embeddings – tensor shape of (n_embeddings, embedding_dim) embeddings of the objects in gallery
conformity_matrix – binary matrix with 1 on same label pos and 0 otherwise
topk – number of top examples for cumulative score counting
- Returns
cmc score
Example:
import torch
from catalyst import metrics

metrics.cmc_score(
    query_embeddings=torch.tensor([
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 1, 1, 1],
        [0, 0, 1, 1],
    ]).float(),
    gallery_embeddings=torch.tensor([
        [1, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 1, 0],
    ]).float(),
    conformity_matrix=torch.tensor([
        [True, False, False],
        [True, False, False],
        [False, True, True],
        [False, True, True],
    ]),
    topk=1,
)
# 1.0
-
catalyst.metrics.functional._cmc_score.
masked_cmc_score
(query_embeddings: torch.Tensor, gallery_embeddings: torch.Tensor, conformity_matrix: torch.Tensor, available_samples: torch.Tensor, topk: int = 1) → float[source]¶ - Parameters
query_embeddings – tensor shape of (n_embeddings, embedding_dim) embeddings of the objects in query
gallery_embeddings – tensor shape of (n_embeddings, embedding_dim) embeddings of the objects in gallery
conformity_matrix – binary matrix with 1 on same label pos and 0 otherwise
available_samples – tensor of shape (query_size, gallery_size), available_samples[i][j] == 1 means that j-th element of gallery should be used while scoring i-th query one
topk – number of top examples for cumulative score counting
- Returns
cmc score with mask
- Raises
ValueError – if there are items that have different labels and are unavailable for each other according to availability matrix
Example:
import torch
from catalyst import metrics

metrics.masked_cmc_score(
    query_embeddings=torch.tensor([
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 1, 1, 1],
        [0, 0, 1, 1],
    ]).float(),
    gallery_embeddings=torch.tensor([
        [1, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 1, 0],
    ]).float(),
    conformity_matrix=torch.tensor([
        [True, False, False],
        [True, False, False],
        [False, True, True],
        [False, True, True],
    ]),
    available_samples=torch.tensor([
        [False, True, True],
        [True, True, True],
        [True, False, True],
        [True, True, True],
    ]),
    topk=1,
)
# 0.75
F1 score¶
-
catalyst.metrics.functional._f1_score.
f1_score
(outputs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-07, argmax_dim: int = -1, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]¶ Fbeta_score with beta=1.
- Parameters
outputs – A list of predicted elements
targets – A list of elements that are to be predicted
eps – epsilon to avoid zero division
argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs
num_classes – int that specifies the number of classes, if it is known
- Returns
F_1 score
- Return type
float
Example:
import torch
from catalyst import metrics

metrics.f1_score(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ]),
    targets=torch.tensor([0, 1, 2]),
)
# tensor([1., 1., 1.])  # per class fbeta
-
catalyst.metrics.functional._f1_score.
fbeta_score
(outputs: torch.Tensor, targets: torch.Tensor, beta: float = 1.0, eps: float = 1e-07, argmax_dim: int = -1, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]¶ Counts fbeta score for given outputs and targets.
- Parameters
outputs – A list of predicted elements
targets – A list of elements that are to be predicted
beta – beta param for f_score
eps – epsilon to avoid zero division
argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs
num_classes – int that specifies the number of classes, if it is known
- Raises
ValueError – if beta is a negative number
- Returns
F_beta score.
- Return type
float
Example:
import torch
from catalyst import metrics

metrics.fbeta_score(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ]),
    targets=torch.tensor([0, 1, 2]),
    beta=1,
)
# tensor([1., 1., 1.])  # per class fbeta
Focal¶
-
catalyst.metrics.functional._focal.
sigmoid_focal_loss
(outputs: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0, alpha: float = 0.25, reduction: str = 'mean')[source]¶ Compute binary focal loss between target and output logits.
- Parameters
outputs – tensor of arbitrary shape
targets – tensor of the same shape as input
gamma – gamma for focal loss
alpha – alpha for focal loss
reduction (string, optional) – specifies the reduction to apply to the output: "none" | "mean" | "sum" | "batchwise_mean". "none": no reduction will be applied, "mean": the sum of the output will be divided by the number of elements in the output, "sum": the output will be summed.
- Returns
computed loss
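Example (an illustrative sketch, not from the library docs; tensors are made up and the numeric result is not asserted):
import torch
from catalyst.metrics.functional._focal import sigmoid_focal_loss

logits = torch.tensor([2.0, -1.0, 0.5, -2.0])   # raw model outputs
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])    # binary ground truth, same shape

loss = sigmoid_focal_loss(outputs=logits, targets=targets, gamma=2.0, alpha=0.25)
# scalar tensor with the mean focal loss (reduction="mean" by default)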
-
catalyst.metrics.functional._focal.
reduced_focal_loss
(outputs: torch.Tensor, targets: torch.Tensor, threshold: float = 0.5, gamma: float = 2.0, reduction='mean') → torch.Tensor[source]¶ Compute reduced focal loss between target and output logits.
It has been proposed in the Reduced Focal Loss: 1st Place Solution to xView object detection in Satellite Imagery paper.
Note
size_average and reduce params are in the process of being deprecated; in the meantime, specifying either of those two args will override reduction.
Source: https://github.com/BloodAxe/pytorch-toolbelt
- Parameters
outputs – tensor of arbitrary shape
targets – tensor of the same shape as input
threshold – threshold for focal reduction
gamma – gamma for focal reduction
reduction – specifies the reduction to apply to the output: "none" | "mean" | "sum" | "batchwise_mean". "none": no reduction will be applied, "mean": the sum of the output will be divided by the number of elements in the output, "sum": the output will be summed, "batchwise_mean": computes the mean loss per sample in the batch. Default: "mean"
- Returns
computed loss # noqa: DAR201
- Return type
torch.Tensor
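Example (an illustrative sketch, not from the library docs; tensors are made up and the numeric result is not asserted):
import torch
from catalyst.metrics.functional._focal import reduced_focal_loss

logits = torch.tensor([2.0, -1.0, 0.5, -2.0])
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])

loss = reduced_focal_loss(outputs=logits, targets=targets, threshold=0.5, gamma=2.0)
# scalar tensor with the mean reduced focal loss (reduction="mean" by default)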
Hitrate¶
-
catalyst.metrics.functional._hitrate.
hitrate
(outputs: torch.Tensor, targets: torch.Tensor, topk: List[int], zero_division: int = 0) → List[torch.Tensor][source]¶ Calculate the hit rate (aka recall) score given model outputs and targets. Hit rate is a metric for evaluating ranking systems. Generate top-N recommendations and, if one of the recommendations is actually something the user has rated, consider that a hit. By rated we mean any explicit form of user interaction. Add up all of the hits for all users and then divide by the number of users.
Compute the top-N recommendations for each user in the training stage and intentionally remove one of these items from the training data.
- Parameters
outputs (torch.Tensor) – Tensor with predicted score size: [batch_size, slate_length] model outputs, logits
targets (torch.Tensor) – Binary tensor with ground truth. 1 means the item is relevant for the user and 0 not relevant size: [batch_size, slate_length] ground truth, labels
topk (List[int]) – Parameters for evaluation on top-k items
zero_division (int) – value returned in the case of division by zero; should be one of 0 or 1
- Returns
the hitrate score
- Return type
hitrate_at_k (List[torch.Tensor])
Example:
import torch
from catalyst import metrics

metrics.hitrate(
    outputs=torch.Tensor([[4.0, 2.0, 3.0, 1.0], [1.0, 2.0, 3.0, 4.0]]),
    targets=torch.Tensor([[0, 0, 1.0, 1.0], [0, 0, 0.0, 0.0]]),
    topk=[1, 2, 3, 4],
)
# [tensor(0.), tensor(0.2500), tensor(0.2500), tensor(0.5000)]
MRR¶
-
catalyst.metrics.functional._mrr.
reciprocal_rank
(outputs: torch.Tensor, targets: torch.Tensor, k: int) → torch.Tensor[source]¶ Calculate the Reciprocal Rank score given model outputs and targets. Data is aggregated in batches.
- Parameters
outputs – Tensor with predicted scores size: [batch_size, slate_length] model outputs, logits
targets – Binary tensor with ground truth. 1 means the item is relevant and 0 if it’s not relevant size: [batch_size, slate_length] ground truth, labels
k – Parameter for evaluation on top-k items
- Returns
MRR score
Examples:
import torch
from catalyst import metrics

metrics.reciprocal_rank(
    outputs=torch.Tensor([
        [4.0, 2.0, 3.0, 1.0],
        [1.0, 2.0, 3.0, 4.0],
    ]),
    targets=torch.Tensor([
        [0, 0, 1.0, 1.0],
        [0, 0, 1.0, 1.0],
    ]),
    k=1,
)
# tensor([[0.], [1.]])

import torch
from catalyst import metrics

metrics.reciprocal_rank(
    outputs=torch.Tensor([
        [4.0, 2.0, 3.0, 1.0],
        [1.0, 2.0, 3.0, 4.0],
    ]),
    targets=torch.Tensor([
        [0, 0, 1.0, 1.0],
        [0, 0, 1.0, 1.0],
    ]),
    k=3,
)
# tensor([[0.5000], [1.0000]])
-
catalyst.metrics.functional._mrr.
mrr
(outputs: torch.Tensor, targets: torch.Tensor, topk: List[int]) → List[torch.Tensor][source]¶ Calculate the Mean Reciprocal Rank (MRR) score given model outputs and targets. Data is aggregated in batches.
MRR@k is the mean, over the whole batch, of the reciprocal rank, that is, the rank of the highest ranked relevant item if any is in the top k, and 0 otherwise. https://en.wikipedia.org/wiki/Mean_reciprocal_rank
- Parameters
outputs – Tensor with predicted scores size: [batch_size, slate_length] model outputs, logits
targets – Binary tensor with ground truth. 1 means the item is relevant and 0 means it is not relevant. size: [batch_size, slate_length] ground truth, labels
topk – Parameters for evaluation on top-k items
- Returns
MRR score
Example:
import torch
from catalyst import metrics

metrics.mrr(
    outputs=torch.Tensor([
        [4.0, 2.0, 3.0, 1.0],
        [1.0, 2.0, 3.0, 4.0],
    ]),
    targets=torch.Tensor([
        [0, 0, 1.0, 1.0],
        [0, 0, 1.0, 1.0],
    ]),
    topk=[1, 3],
)
# [tensor(0.5000), tensor(0.7500)]
NDCG¶
-
catalyst.metrics.functional._ndcg.
dcg
(outputs: torch.Tensor, targets: torch.Tensor, gain_function='exp_rank') → torch.Tensor[source]¶ Computes the Discounted Cumulative Gain (DCG). Graded relevance is used as a measure of usefulness, or gain, from examining a set of items; the gain is discounted at lower ranks. Reference: https://en.wikipedia.org/wiki/Discounted_cumulative_gain
- Parameters
outputs – model outputs, logits with shape [batch_size; slate_length]
targets – ground truth, labels with shape [batch_size; slate_length]
gain_function – String indicating the gain function for the ground truth labels. Two options are available: exp_rank: torch.pow(2, x) - 1; linear_rank: x. By default, exp_rank is used to emphasize retrieving the relevant documents.
- Returns
The discounted gains tensor
- Return type
dcg_score (torch.Tensor)
- Raises
ValueError – if the gain function is not one of the supported options (exp_rank or linear_rank)
Examples:
import torch
from catalyst import metrics

metrics.dcg(
    outputs=torch.tensor([
        [3, 2, 1, 0],
    ]),
    targets=torch.Tensor([
        [2.0, 2.0, 1.0, 0.0],
    ]),
    gain_function="linear_rank",
)
# tensor([[2.0000, 2.0000, 0.6309, 0.0000]])

import torch
from catalyst import metrics

metrics.dcg(
    outputs=torch.tensor([
        [3, 2, 1, 0],
    ]),
    targets=torch.Tensor([
        [2.0, 2.0, 1.0, 0.0],
    ]),
    gain_function="linear_rank",
).sum()
# tensor(4.6309)

import torch
from catalyst import metrics

metrics.dcg(
    outputs=torch.tensor([
        [3, 2, 1, 0],
    ]),
    targets=torch.Tensor([
        [2.0, 2.0, 1.0, 0.0],
    ]),
    gain_function="exp_rank",
)
# tensor([[3.0000, 1.8928, 0.5000, 0.0000]])

import torch
from catalyst import metrics

metrics.dcg(
    outputs=torch.tensor([
        [3, 2, 1, 0],
    ]),
    targets=torch.Tensor([
        [2.0, 2.0, 1.0, 0.0],
    ]),
    gain_function="exp_rank",
).sum()
# tensor(5.3928)
-
catalyst.metrics.functional._ndcg.
ndcg
(outputs: torch.Tensor, targets: torch.Tensor, topk: List[int], gain_function='exp_rank') → List[torch.Tensor][source]¶ Computes nDCG@topk for the specified values of topk.
- Parameters
outputs (torch.Tensor) – model outputs, logits with shape [batch_size; slate_size]
targets (torch.Tensor) – ground truth, labels with shape [batch_size; slate_size]
gain_function – String indicating the gain function for the ground truth labels. Two options are available: exp_rank: torch.pow(2, x) - 1; linear_rank: x. By default, exp_rank is used to emphasize retrieving the relevant documents.
topk (List[int]) – Parameters for evaluation on top-k items
- Returns
tuple with computed ndcg@topk
- Return type
results (Tuple[float])
Examples:
import torch
from catalyst import metrics

metrics.ndcg(
    outputs=torch.tensor([
        [0.5, 0.2, 0.1],
        [0.5, 0.2, 0.1],
    ]),
    targets=torch.Tensor([
        [1.0, 0.0, 1.0],
        [1.0, 0.0, 1.0],
    ]),
    topk=[2],
    gain_function="exp_rank",
)
# [tensor(0.6131)]

import torch
from catalyst import metrics

metrics.ndcg(
    outputs=torch.tensor([
        [0.5, 0.2, 0.1],
        [0.5, 0.2, 0.1],
    ]),
    targets=torch.Tensor([
        [1.0, 0.0, 1.0],
        [1.0, 0.0, 1.0],
    ]),
    topk=[2],
    gain_function="linear_rank",
)
# [tensor(0.5000)]
Precision¶
-
catalyst.metrics.functional._precision.
precision
(outputs: torch.Tensor, targets: torch.Tensor, argmax_dim: int = -1, eps: float = 1e-07, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]¶ Multiclass precision score.
- Parameters
outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)]
targets – ground truth (correct) target values with shape [bs; …, 1]
argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs
eps – float. Epsilon to avoid zero division.
num_classes – int that specifies the number of classes, if it is known
- Returns
precision for every class
- Return type
Tensor
Examples:
import torch
from catalyst import metrics

metrics.precision(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ]),
    targets=torch.tensor([0, 1, 2]),
)
# tensor([1., 1., 1.])

import torch
from catalyst import metrics

metrics.precision(
    outputs=torch.tensor([[0, 0, 1, 1, 0, 1, 0, 1]]),
    targets=torch.tensor([[0, 1, 0, 1, 0, 0, 1, 1]]),
)
# tensor([0.5000, 0.5000])
Recall¶
-
catalyst.metrics.functional._recall.
recall
(outputs: torch.Tensor, targets: torch.Tensor, argmax_dim: int = -1, eps: float = 1e-07, num_classes: Optional[int] = None) → Union[float, torch.Tensor][source]¶ Multiclass recall score.
- Parameters
outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)]
targets – ground truth (correct) target values with shape [bs; …, 1]
argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs
eps – float. Epsilon to avoid zero division.
num_classes – int that specifies the number of classes, if it is known
- Returns
recall for every class
- Return type
Tensor
Examples:
import torch
from catalyst import metrics

metrics.recall(
    outputs=torch.tensor([
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ]),
    targets=torch.tensor([0, 1, 2]),
)
# tensor([1., 1., 1.])

import torch
from catalyst import metrics

metrics.recall(
    outputs=torch.tensor([[0, 0, 1, 1, 0, 1, 0, 1]]),
    targets=torch.tensor([[0, 1, 0, 1, 0, 0, 1, 1]]),
)
# tensor([0.5000, 0.5000])
Segmentation¶
-
catalyst.metrics.functional._segmentation.
iou
(outputs: torch.Tensor, targets: torch.Tensor, class_dim: int = 1, threshold: float = None, mode: str = 'per-class', weights: Optional[List[float]] = None, eps: float = 1e-07) → torch.Tensor[source]¶ Computes the iou/jaccard score, iou score = intersection / union = tp / (tp + fp + fn)
- Parameters
outputs – [N; K; …] tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.
targets – binary [N; K; …] tensor that encodes which of the K classes are associated with the N-th input
class_dim – indicates the class dimension (K) for outputs and targets tensors (default = 1); means nothing if mode="micro"
threshold – threshold for outputs binarization
mode – class summation strategy. Must be one of ['micro', 'macro', 'weighted', 'per-class']. If mode='micro', classes are ignored and the metric is calculated globally. If mode='macro', the metric is calculated per class and then averaged over all classes. If mode='weighted', the metric is calculated per class and then summed over all classes with weights. If mode='per-class', the metric is calculated separately for each class.
weights – class weights (for mode="weighted")
eps – epsilon to avoid zero division
- Returns
IoU (Jaccard) score for each class (if mode='per-class') or an aggregated IoU
Example:
import torch
from catalyst import metrics

size = 4
half_size = size // 2
shape = (1, 1, size, size)
empty = torch.zeros(shape)
full = torch.ones(shape)
left = torch.ones(shape)
left[:, :, :, half_size:] = 0
right = torch.ones(shape)
right[:, :, :, :half_size] = 0
top_left = torch.zeros(shape)
top_left[:, :, :half_size, :half_size] = 1
pred = torch.cat([empty, left, empty, full, left, top_left], dim=1)
targets = torch.cat([full, right, empty, full, left, left], dim=1)

metrics.iou(
    outputs=pred,
    targets=targets,
    class_dim=1,
    threshold=0.5,
    mode="per-class"
)
# tensor([0.0000, 0.0000, 1.0000, 1.0000, 1.0000, 0.5])

metrics.iou(
    outputs=pred,
    targets=targets,
    class_dim=1,
    threshold=0.5,
    mode="macro"
)
# tensor(0.5833)

metrics.iou(
    outputs=pred,
    targets=targets,
    class_dim=1,
    threshold=0.5,
    mode="micro"
)
# tensor(0.4375)
-
catalyst.metrics.functional._segmentation.
dice
(outputs: torch.Tensor, targets: torch.Tensor, class_dim: int = 1, threshold: float = None, mode: str = 'per-class', weights: Optional[List[float]] = None, eps: float = 1e-07) → torch.Tensor[source]¶ Computes the dice score, dice score = 2 * intersection / (intersection + union) = 2 * tp / (2 * tp + fp + fn)
- Parameters
outputs – [N; K; …] tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.
targets – binary [N; K; …] tensor that encodes which of the K classes are associated with the N-th input
class_dim – indicates the class dimension (K) for outputs and targets tensors (default = 1); means nothing if mode="micro"
threshold – threshold for outputs binarization
mode – class summation strategy. Must be one of ['micro', 'macro', 'weighted', 'per-class']. If mode='micro', classes are ignored and the metric is calculated globally. If mode='macro', the metric is calculated per class and then averaged over all classes. If mode='weighted', the metric is calculated per class and then summed over all classes with weights. If mode='per-class', the metric is calculated separately for each class.
weights – class weights (for mode="weighted")
eps – epsilon to avoid zero division
- Returns
Dice score for each class (if mode='per-class') or an aggregated Dice score
Example:
import torch
from catalyst import metrics

size = 4
half_size = size // 2
shape = (1, 1, size, size)
empty = torch.zeros(shape)
full = torch.ones(shape)
left = torch.ones(shape)
left[:, :, :, half_size:] = 0
right = torch.ones(shape)
right[:, :, :, :half_size] = 0
top_left = torch.zeros(shape)
top_left[:, :, :half_size, :half_size] = 1
pred = torch.cat([empty, left, empty, full, left, top_left], dim=1)
targets = torch.cat([full, right, empty, full, left, left], dim=1)

metrics.dice(
    outputs=pred,
    targets=targets,
    class_dim=1,
    threshold=0.5,
    mode="per-class"
)
# tensor([0.0000, 0.0000, 1.0000, 1.0000, 1.0000, 0.6667])

metrics.dice(
    outputs=pred,
    targets=targets,
    class_dim=1,
    threshold=0.5,
    mode="macro"
)
# tensor(0.6111)

metrics.dice(
    outputs=pred,
    targets=targets,
    class_dim=1,
    threshold=0.5,
    mode="micro"
)
# tensor(0.6087)
-
catalyst.metrics.functional._segmentation.
trevsky
(outputs: torch.Tensor, targets: torch.Tensor, alpha: float, beta: Optional[float] = None, class_dim: int = 1, threshold: float = None, mode: str = 'per-class', weights: Optional[List[float]] = None, eps: float = 1e-07) → torch.Tensor[source]¶ Computes the trevsky score, trevsky score = tp / (tp + fp * beta + fn * alpha)
- Parameters
outputs – [N; K; …] tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.
targets – binary [N; K; …] tensor that encodes which of the K classes are associated with the N-th input
alpha – false negative coefficient; the bigger alpha, the bigger the penalty for false negatives. Must be in (0, 1)
beta – false positive coefficient; the bigger beta, the bigger the penalty for false positives. Must be in (0, 1); if None, beta = 1 - alpha
class_dim – indicates the class dimension (K) for outputs and targets tensors (default = 1)
threshold – threshold for outputs binarization
mode – class summation strategy. Must be one of ['micro', 'macro', 'weighted', 'per-class']. If mode='micro', classes are ignored and the metric is calculated globally. If mode='macro', the metric is calculated per class and then averaged over all classes. If mode='weighted', the metric is calculated per class and then summed over all classes with weights. If mode='per-class', the metric is calculated separately for each class.
weights – class weights (for mode="weighted")
eps – epsilon to avoid zero division
- Returns
Trevsky score for each class (if mode='per-class') or an aggregated score
Example:
import torch
from catalyst import metrics

size = 4
half_size = size // 2
shape = (1, 1, size, size)
empty = torch.zeros(shape)
full = torch.ones(shape)
left = torch.ones(shape)
left[:, :, :, half_size:] = 0
right = torch.ones(shape)
right[:, :, :, :half_size] = 0
top_left = torch.zeros(shape)
top_left[:, :, :half_size, :half_size] = 1
pred = torch.cat([empty, left, empty, full, left, top_left], dim=1)
targets = torch.cat([full, right, empty, full, left, left], dim=1)

metrics.trevsky(
    outputs=pred,
    targets=targets,
    alpha=0.2,
    class_dim=1,
    threshold=0.5,
    mode="per-class"
)
# tensor([0.0000, 0.0000, 1.0000, 1.0000, 1.0000, 0.8333])

metrics.trevsky(
    outputs=pred,
    targets=targets,
    alpha=0.2,
    class_dim=1,
    threshold=0.5,
    mode="macro"
)
# tensor(0.6389)

metrics.trevsky(
    outputs=pred,
    targets=targets,
    alpha=0.2,
    class_dim=1,
    threshold=0.5,
    mode="micro"
)
# tensor(0.7000)
-
catalyst.metrics.functional._segmentation.
get_segmentation_statistics
(outputs: torch.Tensor, targets: torch.Tensor, class_dim: int = 1, threshold: float = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]¶ Computes true positive, false positive, false negative for a multilabel segmentation problem.
- Parameters
outputs – [N; K; …] tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.
targets – binary [N; K; …] tensor that encodes which of the K classes are associated with the N-th input
class_dim – indicates the class dimension (K) for outputs and targets tensors (default = 1)
threshold – threshold for outputs binarization
- Returns
Segmentation stats
Example:
import torch
from catalyst import metrics

size = 4
half_size = size // 2
shape = (1, 1, size, size)
empty = torch.zeros(shape)
full = torch.ones(shape)
left = torch.ones(shape)
left[:, :, :, half_size:] = 0
right = torch.ones(shape)
right[:, :, :, :half_size] = 0
top_left = torch.zeros(shape)
top_left[:, :, :half_size, :half_size] = 1
pred = torch.cat([empty, left, empty, full, left, top_left], dim=1)
targets = torch.cat([full, right, empty, full, left, left], dim=1)

metrics.get_segmentation_statistics(
    outputs=pred,
    targets=targets,
    class_dim=1,
    threshold=0.5,
)
# (
#     tensor([ 0.,  0.,  0., 16.,  8.,  4.]),  # per class TP
#     tensor([0., 8., 0., 0., 0., 0.]),  # per class FP
#     tensor([16.,  8.,  0.,  0.,  0.,  4.]),  # per class FN
# )
Misc¶
-
catalyst.metrics.functional._misc.
check_consistent_length
(*tensors)[source]¶ Check that all arrays have consistent first dimensions. Checks whether all objects in arrays have the same shape or length.
- Parameters
tensors – list or tensors of input objects. Objects that will be checked for consistent length.
- Raises
ValueError – “Inconsistent numbers of samples”
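Example (an illustrative sketch, not from the library docs; tensors are made up):
import torch
from catalyst.metrics.functional._misc import check_consistent_length

a = torch.zeros(4, 3)
b = torch.zeros(4, 3)
check_consistent_length(a, b)  # consistent first dimensions -> no error

c = torch.zeros(3, 3)
# check_consistent_length(a, c)  # would raise ValueError: inconsistent numbers of samples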
-
catalyst.metrics.functional._misc.
process_multilabel_components
(outputs: torch.Tensor, targets: torch.Tensor, weights: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]¶ General preprocessing for multilabel-based metrics.
- Parameters
outputs – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model.
targets – binary NxK tensor that encodes which of the K classes are associated with the N-th input (eg: a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4)
weights – importance for each sample
- Returns
processed outputs and targets with [batch_size; num_classes] shape
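Example (an illustrative sketch, not from the library docs; tensors are made up and the exact returned values are not asserted):
import torch
from catalyst.metrics.functional._misc import process_multilabel_components

outputs = torch.tensor([[0.9, 0.1, 0.3], [0.2, 0.8, 0.7]])
targets = torch.tensor([[1, 0, 0], [0, 1, 1]])

outputs, targets, weights = process_multilabel_components(outputs=outputs, targets=targets)
# outputs and targets come back validated with [batch_size; num_classes] shape;
# weights is the processed form of the optional argument (None was passed here)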
-
catalyst.metrics.functional._misc.
process_recsys_components
(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor[source]¶ General pre-processing for calculating RecSys metrics.
- Parameters
outputs (torch.Tensor) – Tensor with predicted scores size: [batch_size, slate_length] model outputs, logits
targets (torch.Tensor) – Binary tensor with ground truth. 1 means the item is relevant for the user and 0 means it is not relevant. size: [batch_size, slate_length] ground truth, labels
- Returns
targets tensor sorted by outputs
- Return type
targets_sorted_by_outputs (torch.Tensor)
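Example (an illustrative sketch, not from the library docs; tensors are made up):
import torch
from catalyst.metrics.functional._misc import process_recsys_components

outputs = torch.tensor([[1.0, 3.0, 2.0]])   # predicted scores, [batch_size, slate_length]
targets = torch.tensor([[1.0, 0.0, 1.0]])   # binary relevance labels

sorted_targets = process_recsys_components(outputs=outputs, targets=targets)
# targets reordered by descending score: the slate [1., 0., 1.] becomes [0., 1., 1.]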
-
catalyst.metrics.functional._misc.
get_binary_statistics
(outputs: torch.Tensor, targets: torch.Tensor, label: int = 1) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]¶ Computes the number of true negative, false positive, false negative, true positive and support for a binary classification problem for a given label.
- Parameters
outputs – estimated targets as predicted by a model with shape [bs; …, 1]
targets – ground truth (correct) target values with shape [bs; …, 1]
label – integer, that specifies label of interest for statistics compute
- Returns
stats
- Return type
Tuple[Tensor, Tensor, Tensor, Tensor, Tensor]
Example:
import torch
from catalyst import metrics

y_pred = torch.tensor([[0, 0, 1, 1, 0, 1, 0, 1]])
y_true = torch.tensor([[0, 1, 0, 1, 0, 0, 1, 1]])
tn, fp, fn, tp, support = metrics.get_binary_statistics(y_pred, y_true)
# tensor(2) tensor(2) tensor(2) tensor(2) tensor(4)
-
catalyst.metrics.functional._misc.
get_multiclass_statistics
(outputs: torch.Tensor, targets: torch.Tensor, argmax_dim: int = -1, num_classes: Optional[int] = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]¶ Computes the number of true negative, false positive, false negative, true positive and support for a multiclass classification problem.
- Parameters
outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)]
targets – ground truth (correct) target values with shape [bs; …, 1]
argmax_dim – int that specifies the dimension for argmax transformation in case of scores/probabilities in outputs
num_classes – int that specifies the number of classes, if it is known
- Returns
stats
- Return type
Tuple[Tensor, Tensor, Tensor, Tensor, Tensor]
Example:
import torch
from catalyst import metrics

y_pred = torch.tensor([1, 2, 3, 0])
y_true = torch.tensor([1, 3, 4, 0])
tn, fp, fn, tp, support = metrics.get_multiclass_statistics(y_pred, y_true)
# (
#     tensor([3., 3., 3., 2., 3.]),
#     tensor([0., 0., 1., 1., 0.]),
#     tensor([0., 0., 0., 1., 1.]),
#     tensor([1., 1., 0., 0., 0.]),
#     tensor([1., 1., 0., 1., 1.])
# )
-
catalyst.metrics.functional._misc.
get_multilabel_statistics
(outputs: torch.Tensor, targets: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]¶ Computes the number of true negative, false positive, false negative, true positive and support for a multilabel classification problem.
- Parameters
outputs – estimated targets as predicted by a model with shape [bs; …, (num_classes or 1)]
targets – ground truth (correct) target values with shape [bs; …, 1]
- Returns
stats
- Return type
Tuple[Tensor, Tensor, Tensor, Tensor, Tensor]
Examples:
import torch
from catalyst import metrics

y_pred = torch.tensor([[0, 0, 1, 1], [0, 1, 0, 1]])
y_true = torch.tensor([[0, 1, 0, 1], [0, 0, 1, 1]])
tn, fp, fn, tp, support = metrics.get_multilabel_statistics(y_pred, y_true)
# (
#     tensor([2., 0., 0., 0.]),
#     tensor([0., 1., 1., 0.]),
#     tensor([0., 1., 1., 0.]),
#     tensor([0., 0., 0., 2.]),
#     tensor([0., 1., 1., 2.]),
# )

import torch
from catalyst import metrics

y_pred = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
y_true = torch.tensor([0, 1, 2])
tn, fp, fn, tp, support = metrics.get_multilabel_statistics(y_pred, y_true)
# (
#     tensor([2., 2., 2.]),
#     tensor([0., 0., 0.]),
#     tensor([0., 0., 0.]),
#     tensor([1., 1., 1.]),
#     tensor([1., 1., 1.]),
# )

import torch
from catalyst import metrics

y_pred = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
y_true = torch.nn.functional.one_hot(torch.tensor([0, 1, 2]))
tn, fp, fn, tp, support = metrics.get_multilabel_statistics(y_pred, y_true)
# (
#     tensor([2., 2., 2.]),
#     tensor([0., 0., 0.]),
#     tensor([0., 0., 0.]),
#     tensor([1., 1., 1.]),
#     tensor([1., 1., 1.]),
# )
-
catalyst.metrics.functional._misc.
get_default_topk_args
(num_classes: int) → Sequence[int][source]¶ Calculate list params for Accuracy@k and mAP@k.
- Parameters
num_classes – number of classes
- Returns
array of accuracy arguments
- Return type
iterable
Examples
>>> get_default_topk_args(num_classes=4)
[1, 3]
>>> get_default_topk_args(num_classes=8)
[1, 3, 5]