Utils

catalyst.utils.argparse.boolean_flag(parser: argparse.ArgumentParser, name: str, default: Optional[bool] = False, help: str = None, shorthand: str = None) → None[source]

Add a boolean flag to a parser inplace.

Parameters
  • parser (argparse.ArgumentParser) – parser to add the flag to

  • name (str) – argument name –<name> will enable the flag, while –no-<name> will disable it

  • default (bool, optional) – default value of the flag

  • help (str) – help string for the flag

  • shorthand (str) – shorthand string for the argument

Examples

>>> parser = argparse.ArgumentParser()
>>> boolean_flag(
>>>     parser, "flag", default=False, help="some flag", shorthand="f"
>>> )
catalyst.utils.callbacks.process_callbacks(callbacks: Union[list, collections.OrderedDict]) → collections.OrderedDict[source]

Creates an sequence of callbacks and sort them :param callbacks: either list of callbacks or ordered dict

Returns

sequence of callbacks sorted by callback order

catalyst.utils.checkpoint.load_checkpoint(filepath)[source]
catalyst.utils.checkpoint.pack_checkpoint(model=None, criterion=None, optimizer=None, scheduler=None, **kwargs)[source]
catalyst.utils.checkpoint.save_checkpoint(checkpoint: Dict, logdir: Union[pathlib.Path, str], suffix: str, is_best: bool = False, is_last: bool = False, special_suffix: str = '')[source]
catalyst.utils.checkpoint.unpack_checkpoint(checkpoint, model=None, criterion=None, optimizer=None, scheduler=None)[source]
catalyst.utils.compression.compress(data)[source]
catalyst.utils.compression.compress_if_needed(data)[source]
catalyst.utils.compression.decompress(data)[source]
catalyst.utils.compression.decompress_if_needed(data)[source]
catalyst.utils.compression.is_compressed(data)[source]
catalyst.utils.config.load_ordered_yaml(stream, Loader=<class 'yaml.loader.Loader'>, object_pairs_hook=<class 'collections.OrderedDict'>)[source]

Loads yaml config into OrderedDict

Parameters
  • stream – opened file with yaml

  • Loader – base class for yaml Loader

  • object_pairs_hook – type of mapping

Returns

configuration

Return type

dict

catalyst.utils.config.get_environment_vars() → Dict[str, Any][source]

Creates a dictionary with environment variables

Returns

environment variables

Return type

dict

catalyst.utils.config.dump_environment(experiment_config: Dict, logdir: str, configs_path: List[str] = None) → None[source]

Saves config, environment variables and package list in JSON into logdir

Parameters
  • experiment_config (dict) – experiment config

  • logdir (str) – path to logdir

  • configs_path – path(s) to config

catalyst.utils.config.parse_config_args(*, config, args, unknown_args)[source]
catalyst.utils.config.parse_args_uargs(args, unknown_args)[source]

Function for parsing configuration files

Parameters
  • args – recognized arguments

  • unknown_args – unrecognized arguments

Returns

updated arguments, dict with config

Return type

tuple

catalyst.utils.confusion_matrix.calculate_confusion_matrix_from_arrays(ground_truth: numpy.array, prediction: numpy.array, num_classes: int) → numpy.array[source]

Calculate confusion matrix for a given set of classes. if GT value is outside of the [0, num_classes) it is excluded.

Parameters
  • ground_truth

  • prediction

  • num_classes

Returns:

catalyst.utils.confusion_matrix.calculate_confusion_matrix_from_tensors(y_pred_logits: torch.Tensor, y_true: torch.Tensor) → numpy.array[source]
catalyst.utils.confusion_matrix.calculate_tp_fp_fn(confusion_matrix: numpy.array) → numpy.array[source]
catalyst.utils.dataset.create_dataframe(dataset: Dict[str, object], **dataframe_args) → pandas.core.frame.DataFrame[source]

Create pd.DataFrame from dict like {key: [values]}

Parameters
  • dataset – dict like {key: [values]}

  • **dataframe_args

    indexIndex or array-like

    Index to use for resulting frame. Will default to np.arange(n) if no indexing information part of input data and no index provided

    columnsIndex or array-like

    Column labels to use for resulting frame. Will default to np.arange(n) if no column labels are provided

    dtypedtype, default None

    Data type to force, otherwise infer

Returns

dataframe from giving dataset

Return type

pd.DataFrame

catalyst.utils.dataset.create_dataset(dirs: str, extension: str = None, process_fn: Callable[[str], object] = None, recursive: bool = False) → Dict[str, object][source]

Create dataset (dict like {key: [values]}) from vctk-like dataset:

dataset/
    cat/
        *.ext
    dog/
        *.ext
Parameters
  • dirs (str) – path to dirs, for example /home/user/data/**

  • extension (str) – data extension you are looking for

  • process_fn (Callable[[str], object]) – function(path_to_file) -> object process function for found files, by default

  • recursive (bool) – enables recursive globbing

Returns

dataset

Return type

dict

catalyst.utils.dataset.split_dataset_train_test(dataset: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[Dict[str, object], Dict[str, object]][source]

Split dataset in train and test parts.

Parameters
  • dataset – dict like dataset

  • **train_test_split_args

    test_sizefloat, int, or None (default is None)

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.

    train_sizefloat, int, or None (default is None)

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.

    random_stateint or RandomState

    Pseudo-random number generator state used for random sampling.

    stratifyarray-like or None (default is None)

    If not None, data is split in a stratified fashion, using this as the class labels.

Returns

train and test dicts

catalyst.utils.ddp.get_nn_from_ddp_module(model: torch.nn.modules.module.Module) → torch.nn.modules.module.Module[source]

Return a real model from a torch.nn.DataParallel, torch.nn.parallel.DistributedDataParallel, or apex.parallel.DistributedDataParallel.

Parameters

model – A model, or DataParallel wrapper.

Returns

A model

catalyst.utils.ddp.is_wrapped_with_ddp(model: torch.nn.modules.module.Module) → bool[source]

Checks whether model is wrapped with DataParallel/DistributedDataParallel.

catalyst.utils.dict.append_dict(dict1, dict2)[source]

Appends dict2 with the same keys as dict1 to dict1

catalyst.utils.dict.flatten_dict(dictionary: Dict[str, Any], parent_key: str = '', separator: str = '/') → collections.OrderedDict[source]

Make the given dictionary flatten

Parameters
  • dictionary (dict) – giving dictionary

  • parent_key (str, optional) – prefix nested keys with string parent_key

  • separator (str, optional) – delimiter between parent_key and key to use

Returns

ordered dictionary with flatten keys

Return type

collections.OrderedDict

catalyst.utils.dict.get_dictkey_auto_fn(key: Union[str, List[str], None]) → Callable[source]

Function generator for sub-dict preparation from dict based on predefined keys.

Parameters

key – keys

Returns

function

catalyst.utils.dict.get_key_dict(dictionary: dict, key: Union[str, List[str], None]) → Dict[source]

Takes sub-dict from dict by dict-mapping of keys.

Parameters
  • dictionary – dict

  • key – dict-mapping of keys

Returns

sub-dict

catalyst.utils.dict.get_key_list(dictionary: dict, key: Union[str, List[str], None]) → Dict[source]

Takes sub-dict from dict by list of keys.

Parameters
  • dictionary – dict

  • key – list of keys

Returns

sub-dict

catalyst.utils.dict.get_key_none(dictionary: dict, key: Union[str, List[str], None]) → Dict[source]

Takes whole dict. :param dictionary: dict :param key: none

Returns

dict

catalyst.utils.dict.get_key_str(dictionary: dict, key: Union[str, List[str], None]) → Any[source]

Takes value from dict by key.

Parameters
  • dictionary – dict

  • key – key

Returns

value

catalyst.utils.dict.merge_dicts(*dicts: dict) → dict[source]

Recursive dict merge. Instead of updating only top-level keys, merge_dicts recurses down into dicts nested to an arbitrary depth, updating keys.

Parameters

*dicts – several dictionaries to merge

Returns

deep-merged dictionary

Return type

dict

catalyst.utils.hash.get_hash(obj: Any) → str[source]

Creates unique hash from object following way: - Represent obj as sting recursively - Hash this string with sha256 hash function - encode hash with url-safe base64 encoding

Parameters

obj – object to hash

Returns

base64-encoded string

catalyst.utils.hash.get_short_hash(o) → str[source]
catalyst.utils.image.has_image_extension(uri) → bool[source]

Check that file has image extension

Parameters

uri (Union[str, pathlib.Path]) – The resource to load the file from

Returns

True if file has image extension, False otherwise

Return type

bool

catalyst.utils.image.imread(uri, grayscale: bool = False, expand_dims: bool = True, rootpath: Union[str, pathlib.Path] = None, **kwargs)[source]
Parameters
  • uri – {str, pathlib.Path, bytes, file}

  • resource to load the image from, e.g. a filename, pathlib.Path, (The) –

  • address or file object, see the docs for more info. (http) –

  • grayscale

  • expand_dims

  • rootpath

Returns:

catalyst.utils.image.mask_to_overlay_image(image: numpy.ndarray, masks: List[numpy.ndarray], threshold: float = 0, mask_strength: float = 0.5) → numpy.ndarray[source]

Draws every mask for with some color over image

Parameters
  • image (np.ndarray) – RGB image used as underlay for masks

  • masks (List[np.ndarray]) – list of masks

  • threshold (float) – threshold for masks binarization

  • mask_strength (float) – opacity of colorized masks

Returns

HxWx3 image with overlay

Return type

np.ndarray

catalyst.utils.image.mimread(uri, clip_range: Tuple[int, int] = None, expand_dims: bool = True, rootpath: Union[str, pathlib.Path] = None, **kwargs)[source]
Parameters
  • uri – {str, pathlib.Path, bytes, file}

  • resource to load the mask from, e.g. a filename, pathlib.Path, (The) –

  • address or file object, see the docs for more info. (http) –

  • clip_range (Tuple[int, int]) – lower and upper interval edges, image values outside the interval are clipped to the interval edges

  • expand_dims (bool) – if True, append channel axis to grayscale images

  • rootpath (Union[str, pathlib.Path]) – path to an image (allows to use relative path)

Returns

Image

Return type

np.ndarray

catalyst.utils.image.mimwrite_with_meta(uri, ims, meta, **kwargs)[source]
catalyst.utils.image.tensor_from_rgb_image(image: numpy.ndarray) → torch.Tensor[source]
catalyst.utils.image.tensor_to_ndimage(images: torch.Tensor, denormalize: bool = True, mean: Tuple[float, float, float] = (0.485, 0.456, 0.406), std: Tuple[float, float, float] = (0.229, 0.224, 0.225), move_channels_dim: bool = True, dtype=<class 'numpy.float32'>) → numpy.ndarray[source]

Convert float image(s) with standard normalization to np.ndarray with [0..1] when dtype is np.float32 and [0..255] when dtype is np.uint8.

Parameters
  • images (torch.Tensor) – [B]xCxHxW float tensor

  • denormalize (bool) – if True, multiply image(s) by std and add mean

  • mean (Tuple[float, float, float]) – per channel mean to add

  • std (Tuple[float, float, float]) – per channel std to multiply

  • move_channels_dim (bool) – if True, convert tensor to [B]xHxWxC format

  • dtype – result ndarray dtype. Only float32 and uint8 are supported.

Returns

[B]xHxWxC np.ndarray of dtype

catalyst.utils.initialization.bias_init_with_prob(prior_prob)[source]

Initialize conv/fc bias value according to giving probablity

catalyst.utils.initialization.constant_init(module, val, bias=0)[source]

Initialize the module with constant value

catalyst.utils.initialization.create_optimal_inner_init(nonlinearity: torch.nn.modules.module.Module, **kwargs) → Callable[[torch.nn.modules.module.Module], None][source]

Create initializer for inner layers based on their activation function (nonlinearity).

Parameters

nonlinearity – non-linear activation

catalyst.utils.initialization.kaiming_init(module, mode='fan_out', nonlinearity='relu', bias=0, distribution='normal')[source]

Initialize the module with he initialization

catalyst.utils.initialization.normal_init(module, mean=0, std=1, bias=0)[source]

Initialize the module with normal distribution

catalyst.utils.initialization.outer_init(layer: torch.nn.modules.module.Module) → None[source]

Initialization for output layers of policy and value networks typically used in deep reinforcement learning literature.

catalyst.utils.initialization.uniform_init(module, a=0, b=1, bias=0)[source]

Initialize the module with uniform distribution

catalyst.utils.initialization.xavier_init(module, gain=1, bias=0, distribution='normal')[source]

Initialize the module with xavier initialization

catalyst.utils.misc.args_are_not_none(*args: Optional[Any]) → bool[source]

Check that all arguments are not None :param *args: values :type *args: Any

Returns

True if all value were not None, False otherwise

Return type

bool

catalyst.utils.misc.copy_directory(input_dir: pathlib.Path, output_dir: pathlib.Path) → None[source]

Recursively copies the input directory

Parameters
  • input_dir (Path) – input directory

  • output_dir (Path) – output directory

catalyst.utils.misc.format_metric(name: str, value: float) → str[source]

Format metric. Metric will be returned in the scientific format if 4 decimal chars are not enough (metric value lower than 1e-4)

Parameters
  • name (str) – metric name

  • value (float) – value of metric

catalyst.utils.misc.get_utcnow_time(format: str = None) → str[source]

Return string with current utc time in chosen format

Parameters

format (str) – format string. if None “%y%m%d.%H%M%S” will be used.

Returns

formatted utc time string

Return type

str

catalyst.utils.misc.is_exception(ex: Any) → bool[source]

Check if the argument is of Exception type

catalyst.utils.misc.make_tuple(tuple_like)[source]

Creates a tuple if given tuple_like value isn’t list or tuple

Returns

tuple or list

catalyst.utils.misc.maybe_recursive_call(object_or_dict, method: str, recursive_args=None, recursive_kwargs=None, **kwargs)[source]

Calls the method recursively for the object_or_dict

Parameters
  • object_or_dict (Any) – some object or a dictinary of objects

  • method (str) – method name to call

  • recursive_args – list of arguments to pass to the method

  • recursive_kwargs – list of key-arguments to pass to the method

  • **kwargs – Arbitrary keyword arguments

catalyst.utils.misc.pairwise(iterable: Iterable[Any]) → Iterable[Any][source]

Iterate sequences by pairs

Parameters

iterable – Any iterable sequence

Returns

pairwise iterator

Examples

>>> for i in pairwise([1, 2, 5, -3]):
>>>     print(i)
(1, 2)
(2, 5)
(5, -3)
catalyst.utils.notebook.save_notebook(filepath: str, wait_period: float = 0.1, max_wait_time=1.0)[source]
catalyst.utils.numpy.dict2structed(array: Dict)[source]
catalyst.utils.numpy.geometric_cumsum(alpha, x)[source]

Calculate future accumulated sums for each element in a list with an exponential factor.

Given input data \(x_1, \dots, x_n\) # noqa: E501, W605 and exponential factor \(lpha\in [0, 1]\), # noqa: E501, W605 it returns an array \(y\) with the same length and each element is calculated as following

\[y_i = x_i + lpha x_{i+1} + lpha^2 x_{i+2} + \dots + lpha^{n-i-1}x_{n-1} + lpha^{n-i}x_{n} # noqa: E501, W605\]

Note

To gain the optimal runtime speed, we use scipy.signal.lfilter

Example

>>> geometric_cumsum(0.1, [[1, 1], [2, 2], [3, 3], [4, 4]])
array([[1.234, 1.234], [2.34 , 2.34 ], [3.4  , 3.4  ], [4.   , 4.   ]])
Parameters
  • alpha (float) – exponential factor between zero and one.

  • x (np.ndarray) – input data, [trajectory_len, num_atoms]

Returns

calculated data

Return type

out (np.ndarray)

source: https://github.com/zuoxingdong/lagom

catalyst.utils.numpy.get_one_hot(label: int, num_classes: int, smoothing: float = None) → numpy.ndarray[source]

Applies OneHot vectorization to a giving scalar, optional with label smoothing from https://arxiv.org/abs/1812.01187

Parameters
  • label (int) – scalar value to be vectorized

  • num_classes (int) – total number of classes

  • smoothing (float, optional) – if specified applies label smoothing from Bag of Tricks for Image Classification with Convolutional Neural Networks paper

Returns

a one-hot vector with shape (num_classes,)

Return type

np.ndarray

catalyst.utils.numpy.np_softmax(x)[source]
catalyst.utils.numpy.structed2dict(array: numpy.ndarray)[source]
catalyst.utils.pandas.balance_classes(dataframe: pandas.core.frame.DataFrame, class_column: str = 'label', random_state: int = 42, how: str = 'downsampling') → pandas.core.frame.DataFrame[source]

Balance classes in dataframe by class_column

See also catalyst.data.sampler.BalanceClassSampler

Parameters
  • dataframe – a dataset

  • class_column – which column to use for split

  • random_state – seed for random shuffle

  • how – strategy to sample must be one on [“downsampling”, “upsampling”]

Returns

new dataframe with balanced class_column

Return type

pd.DataFrame

catalyst.utils.pandas.dataframe_to_list(dataframe: pandas.core.frame.DataFrame) → List[dict][source]

Converts dataframe to a list of rows (without indexes)

Parameters

dataframe (DataFrame) – input dataframe

Returns

list of rows

Return type

(List[dict])

catalyst.utils.pandas.folds_to_list(folds: Union[list, str, pandas.core.series.Series]) → List[int][source]

This function formats string or either list of numbers into a list of unique int

Parameters

folds (Union[list, str, pd.Series]) – Either list of numbers or one string with numbers separated by commas or pandas series

Returns

list of unique ints

Return type

List[int]

Examples

>>> folds_to_list("1,2,1,3,4,2,4,6")
[1, 2, 3, 4, 6]
>>> folds_to_list([1, 2, 3.0, 5])
[1, 2, 3, 5]
Raises

ValueError – if value in string or array cannot be casted to int

catalyst.utils.pandas.get_dataset_labeling(dataframe: pandas.core.frame.DataFrame, tag_column: str) → Dict[str, int][source]

Prepares a mapping using unique values from tag_column

{
    "class_name_0": 0,
    "class_name_1": 1,
    ...
    "class_name_N": N
}
Parameters
  • dataframe – a dataset

  • tag_column – which column to use

Returns

mapping from tag to labels

Return type

Dict[str, int]

catalyst.utils.pandas.map_dataframe(dataframe: pandas.core.frame.DataFrame, tag_column: str, class_column: str, tag2class: Dict[str, int], verbose: bool = False) → pandas.core.frame.DataFrame[source]

This function maps tags from tag_column to ints into class_column Using tag2class dictionary

Parameters
  • dataframe (pd.DataFrame) – input dataframe

  • tag_column (str) – column with tags

  • class_column (str) –

  • tag2class (Dict[str, int]) – mapping from tags to class labels

  • verbose – flag if true, uses tqdm

Returns

updated dataframe with class_column

Return type

pd.DataFrame

catalyst.utils.pandas.merge_multiple_fold_csv(fold_name: str, paths: Optional[str]) → pandas.core.frame.DataFrame[source]

Reads csv into one DataFrame with column fold :param fold_name: current fold name :type fold_name: str :param paths: paths to csv separated by commas :type paths: str

Returns

merged dataframes with column fold == fold_name

Return type

pd.DataFrame

catalyst.utils.pandas.read_csv_data(in_csv: str = None, train_folds: Optional[List[int]] = None, valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, seed: int = 42, n_folds: int = 5, in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, List[dict], List[dict], List[dict]][source]

From giving path in_csv reads a dataframe and split it to train/valid/infer folds or from several paths in_csv_train, in_csv_valid, in_csv_infer reads independent folds.

Note

This function can be used with different combinations of params.
First block is used to get dataset from one csv:

in_csv, train_folds, valid_folds, infer_folds, seed, n_folds

Second includes paths to different csv for train/valid and infer parts:

in_csv_train, in_csv_valid, in_csv_infer

The other params (tag2class, tag_column, class_column) are optional

for any previous block

Parameters
  • in_csv (str) – paths to whole dataset

  • train_folds (List[int]) – train folds

  • valid_folds (List[int], optional) – valid folds. If none takes all folds not included in train_folds

  • infer_folds (List[int], optional) – infer folds. If none takes all folds not included in train_folds and valid_folds

  • seed (int) – seed for split

  • n_folds (int) – number of folds

  • in_csv_train (str) – paths to train csv separated by commas

  • in_csv_valid (str) – paths to valid csv separated by commas

  • in_csv_infer (str) – paths to infer csv separated by commas

  • tag2class (Dict[str, int]) – mapping from label names into ints

  • tag_column (str) – column with label names

  • class_column (str) – column to use for split

Returns

tuple with 4 elements (whole dataframe, list with train data, list with valid data and list with infer data)

Return type

(Tuple[pd.DataFrame, List[dict], List[dict], List[dict]])

catalyst.utils.pandas.read_multiple_dataframes(in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]

This function reads train/valid/infer dataframes from giving paths :param in_csv_train: paths to train csv separated by commas :type in_csv_train: str :param in_csv_valid: paths to valid csv separated by commas :type in_csv_valid: str :param in_csv_infer: paths to infer csv separated by commas :type in_csv_infer: str :param tag2class: mapping from label names into int :type tag2class: Dict[str, int], optional :param tag_column: column with label names :type tag_column: str, optional :param class_column: column to use for split :type class_column: str, optional

Returns

tuple with 4 dataframes

whole dataframe, train part, valid part and infer part

Return type

(tuple)

catalyst.utils.pandas.separate_tags(dataframe: pandas.core.frame.DataFrame, tag_column: str = 'tag', tag_delim: str = ', ') → pandas.core.frame.DataFrame[source]

Separates values in class_column column

Parameters
  • dataframe – a dataset

  • tag_column – column name to separate values

  • tag_delim – delimiter to separate values

Returns

new dataframe

Return type

pd.DataFrame

catalyst.utils.pandas.split_dataframe(dataframe: pandas.core.frame.DataFrame, train_folds: List[int], valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, tag2class: Optional[Dict[str, int]] = None, tag_column: str = None, class_column: str = None, seed: int = 42, n_folds: int = 5) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]

Split a Pandas DataFrame into folds.

Parameters
  • dataframe (pd.DataFrame) – input dataframe

  • train_folds (List[int]) – train folds

  • valid_folds (List[int], optional) – valid folds. If none takes all folds not included in train_folds

  • infer_folds (List[int], optional) – infer folds. If none takes all folds not included in train_folds and valid_folds

  • tag2class (Dict[str, int], optional) – mapping from label names into int

  • tag_column (str, optional) – column with label names

  • class_column (str, optional) – column to use for split

  • seed (int) – seed for split

  • n_folds (int) – number of folds

Returns

tuple with 4 dataframes

whole dataframe, train part, valid part and infer part

Return type

(tuple)

catalyst.utils.pandas.split_dataframe_on_column_folds(dataframe: pandas.core.frame.DataFrame, column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]

Splits DataFrame into N folds.

Parameters
  • dataframe – a dataset

  • column – which column to use

  • random_state – seed for random shuffle

  • n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.utils.pandas.split_dataframe_on_folds(dataframe: pandas.core.frame.DataFrame, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]

Splits DataFrame into N folds.

Parameters
  • dataframe – a dataset

  • random_state – seed for random shuffle

  • n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.utils.pandas.split_dataframe_on_stratified_folds(dataframe: pandas.core.frame.DataFrame, class_column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]

Splits DataFrame into N stratified folds.

Also see catalyst.data.sampler.BalanceClassSampler

Parameters
  • dataframe – a dataset

  • class_column – which column to use for split

  • random_state – seed for random shuffle

  • n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.utils.pandas.split_dataframe_train_test(dataframe: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]

Split dataframe in train and test part.

Parameters
  • dataframe – pd.DataFrame to split

  • **train_test_split_args

    test_sizefloat, int, or None (default is None)

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.

    train_sizefloat, int, or None (default is None)

    If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.

    random_stateint or RandomState

    Pseudo-random number generator state used for random sampling.

    stratifyarray-like or None (default is None)

    If not None, data is split in a stratified fashion, using this as the class labels.

Returns

train and test DataFrames

PS. It exist cause sklearn split is overcomplicated.

class catalyst.utils.parallel.DumbPool[source]

Bases: object

imap_unordered(func, args)[source]
catalyst.utils.parallel.get_pool(workers: int) → Union[multiprocessing.pool.Pool, catalyst.utils.parallel.DumbPool][source]
catalyst.utils.parallel.parallel_imap(func, args, pool: Union[multiprocessing.pool.Pool, catalyst.utils.parallel.DumbPool]) → List[T][source]
catalyst.utils.parallel.tqdm_parallel_imap(func, args, pool: Union[multiprocessing.pool.Pool, catalyst.utils.parallel.DumbPool], total: int = None, pbar=<class 'tqdm.std.tqdm'>) → List[T][source]
catalyst.utils.plotly.plot_tensorboard_log(logdir: Union[str, pathlib.Path], step: Optional[str] = 'batch', metrics: Optional[List[str]] = None, height: Optional[int] = None, width: Optional[int] = None) → None[source]
catalyst.utils.scripts.import_module(expdir: pathlib.Path)[source]
catalyst.utils.scripts.dump_code(expdir, logdir)[source]
catalyst.utils.scripts.dump_python_files(src, dst)[source]
catalyst.utils.scripts.import_experiment_and_runner(expdir: pathlib.Path)[source]
catalyst.utils.scripts.dump_base_experiment_code(src: pathlib.Path, dst: pathlib.Path)[source]
catalyst.utils.seed.set_global_seed(seed: int) → None[source]

Sets random seed into PyTorch, TensorFlow, Numpy and Random.

Parameters

seed – random seed

catalyst.utils.serialization.deserialize(data)

Deserialize bytes into an object using pickle

Parameters

bytes – a bytes object containing serialized with pickle data.

Returns

Returns a value deserialized from the bytes-like object.

catalyst.utils.serialization.pickle_deserialize(data)[source]

Deserialize bytes into an object using pickle

Parameters

bytes – a bytes object containing serialized with pickle data.

Returns

Returns a value deserialized from the bytes-like object.

catalyst.utils.serialization.pickle_serialize(data)[source]

Serialize the data into bytes using pickle

Parameters

data – a value

Returns

Returns a bytes object serialized with pickle data.

catalyst.utils.serialization.pyarrow_deserialize(data)[source]

Deserialize bytes into an object using pyarrow

Parameters

bytes – a bytes object containing serialized with pyarrow data.

Returns

Returns a value deserialized from the bytes-like object.

catalyst.utils.serialization.pyarrow_serialize(data)[source]

Serialize the data into bytes using pyarrow

Parameters

data – a value

Returns

Returns a bytes object serialized with pyarrow data.

catalyst.utils.serialization.serialize(data)

Serialize the data into bytes using pickle

Parameters

data – a value

Returns

Returns a bytes object serialized with pickle data.

catalyst.utils.torch.ce_with_logits(logits, target)[source]

Returns cross entropy for giving logits

catalyst.utils.torch.log1p_exp(x)[source]

Computationally stable function for computing log(1+exp(x)).

catalyst.utils.torch.normal_sample(mu, sigma)[source]

Sample from multivariate Gaussian distribution z ~ N(z|mu,sigma) while supporting backpropagation through its mean and variance.

catalyst.utils.torch.normal_logprob(mu, sigma, z)[source]

Probability density function of multivariate Gaussian distribution N(z|mu,sigma).

catalyst.utils.torch.soft_update(target, source, tau)[source]

Updates the target data with smoothing by tau

catalyst.utils.torch.get_optimizable_params(model_or_params)[source]

Returns all the parameters that requires gradients

catalyst.utils.torch.get_optimizer_momentum(optimizer: torch.optim.optimizer.Optimizer) → float[source]

Get momentum of current optimizer.

Parameters

optimizer – PyTorch optimizer

Returns

momentum at first param group

Return type

float

catalyst.utils.torch.set_optimizer_momentum(optimizer: torch.optim.optimizer.Optimizer, value: float, index: int = 0)[source]

Set momentum of index ‘th param group of optimizer to value

Parameters
  • optimizer – PyTorch optimizer

  • value (float) – new value of momentum

  • index (int, optional) – integer index of optimizer’s param groups, default is 0

catalyst.utils.torch.assert_fp16_available() → None[source]

Asserts for installed and available Apex FP16

catalyst.utils.torch.get_device() → torch.device[source]

Simple returning the best available device (GPU or CPU)

catalyst.utils.torch.get_available_gpus()[source]

Array of available GPU ids :returns: available GPU ids :rtype: iterable

Examples

>>> os.environ["CUDA_VISIBLE_DEVICES"] = "0,2"
>>> get_available_gpus()
>>> [0, 2]
>>> os.environ["CUDA_VISIBLE_DEVICES"] = "0,-1,1"
>>> get_available_gpus()
>>> [0]
>>> os.environ["CUDA_VISIBLE_DEVICES"] = ""
>>> get_available_gpus()
>>> []
>>> os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
>>> get_available_gpus()
>>> []
catalyst.utils.torch.get_activation_fn(activation: str = None)[source]

Returns the activation function from torch.nn by its name

catalyst.utils.torch.any2device(value, device: Union[str, torch.device])[source]

Move tensor, list of tensors, list of list of tensors, dict of tensors, tuple of tensors to target device.

Parameters
  • value – Object to be moved

  • device (Device) – target device ids

Returns

Same structure as value, but all tensors and np.arrays moved to device

catalyst.utils.torch.prepare_cudnn(deterministic: bool = None, benchmark: bool = None) → None[source]

Prepares CuDNN benchmark and sets CuDNN to be deterministic/non-deterministic mode

Parameters
  • deterministic (bool) – deterministic mode if running in CuDNN backend.

  • benchmark (bool) – If True use CuDNN heuristics to figure out which algorithm will be most performant for your model architecture and input. Setting it to False may slow down your training.

catalyst.utils.torch.process_model_params(model: torch.nn.modules.module.Module, layerwise_params: Dict[str, dict] = None, no_bias_weight_decay: bool = True, lr_scaling: float = 1.0) → List[Union[torch.nn.parameter.Parameter, dict]][source]

Gains model parameters for torch.optim.Optimizer

Parameters
  • model (torch.nn.Module) – Model to process

  • layerwise_params (Dict) – Order-sensitive dict where each key is regex pattern and values are layer-wise options for layers matching with a pattern

  • no_bias_weight_decay (bool) – If true, removes weight_decay for all bias parameters in the model

  • lr_scaling (float) – layer-wise learning rate scaling, if 1.0, learning rates will not be scaled

Returns

parameters for an optimizer

Return type

iterable

Examples

>>> model = catalyst.contrib.models.segmentation.ResnetUnet()
>>> layerwise_params = collections.OrderedDict([
>>>     ("conv1.*", dict(lr=0.001, weight_decay=0.0003)),
>>>     ("conv.*", dict(lr=0.002))
>>> ])
>>> params = process_model_params(model, layerwise_params)
>>> optimizer = torch.optim.Adam(params, lr=0.0003)
catalyst.utils.torch.set_requires_grad(model: torch.nn.modules.module.Module, requires_grad: bool)[source]

Sets the requires_grad value for all model parameters.

Parameters
  • model (torch.nn.Module) – Model

  • requires_grad (bool) – value

Examples

>>> model = SimpleModel()
>>> set_requires_grad(model, requires_grad=True)
catalyst.utils.torch.get_network_output(net: torch.nn.modules.module.Module, *input_shapes)[source]

For each input shape returns an output tensor

Parameters
  • net (Model) – the model

  • *args – variable length argument list of shapes

catalyst.utils.torch.detach(tensor: torch.Tensor) → numpy.ndarray[source]

Detaches the input tensor to a numpy array

catalyst.utils.visualization.plot_confusion_matrix(cm, class_names=None, normalize=False, title='confusion matrix', fname=None, show=True, figsize=12, fontsize=32, colormap='Blues')[source]

Render the confusion matrix and return matplotlib”s figure with it. Normalization can be applied by setting normalize=True.

catalyst.utils.visualization.render_figure_to_tensor(figure)[source]

Tools

class catalyst.utils.tools.metric_manager.TimerManager[source]

Bases: object

reset() → None[source]

Reset all previous timers

start(name: str) → None[source]

Starts timer name :param name: name of a timer :type name: str

stop(name: str) → None[source]

Stops timer name :param name: name of a timer :type name: str

class catalyst.utils.tools.metric_manager.MetricManager(valid_loader: str = 'valid', main_metric: str = 'loss', minimize: bool = True, batch_consistant_metrics: bool = True)[source]

Bases: object

add_batch_value(name: str = None, value: Any = None, metrics_dict: Dict[str, Any] = None)[source]
property batch_values
begin_batch()[source]
begin_epoch()[source]
begin_loader(name: str)[source]
end_batch()[source]
end_epoch_train()[source]
end_loader()[source]
property main_metric_value
class catalyst.utils.tools.dynamic_array.DynamicArray(array_or_shape=(None, ), dtype=None, capacity=64, allow_views_on_resize=False)[source]

Bases: object

Dynamically growable numpy array. Credits to https://github.com/maciejkula/dynarray

Parameters
  • array_or_shape (numpy array or tuple) – If an array, a growable array with the same shape, dtype, and a copy of the data will be created. The array will grow along the first dimension. If a tuple, en empty array of the specified shape will be created. The first element needs to be None to denote that the array will grow along the first dimension.

  • dtype (optional, array dtype) – The dtype the array should have.

  • capacity (int, optional) – The initial capacity of the array.

  • allow_views_on_resize (boolean, optional) – If False, an exception will be thrown if the array is resized while there are live references to the array”s contents. When the array is resized, these will point at old data. Set to True if you want to silence the exception.

Examples

Create a multidimensional array and append rows:

>>> from dynarray import DynamicArray
>>> # The leading dimension is None to denote that this is
>>> # the dynamic dimension
>>> array = DynamicArray((None, 20, 10))
>>> array.append(np.random.random((20, 10)))
>>> array.extend(np.random.random((100, 20, 10)))

Slice and perform arithmetic like with normal numpy arrays:

>>> array[:2]
MAGIC_METHODS = ('__radd__', '__add__', '__sub__', '__rsub__', '__mul__', '__rmul__', '__div__', '__rdiv__', '__pow__', '__rpow__', '__eq__', '__len__')
__init__(array_or_shape=(None, ), dtype=None, capacity=64, allow_views_on_resize=False)[source]

Init

append(value)[source]

Append a row to the array.

The row’s shape has to match the array’s trailing dimensions.

property capacity

Capacity of the array

property dtype

Dtype of the array

extend(values)[source]

Extend the array with a set of rows.

The rows” dimensions must match the trailing dimensions of the array.

property shape

Shape of the array

shrink_to_fit()[source]

Reduce the array”s capacity to its size.

class catalyst.utils.tools.frozen_class.FrozenClass[source]

Bases: object

Class which prohibit __setattr__ on existing attributes

Examples

>>> class State(FrozenClass):
class catalyst.utils.tools.registry.Registry(default_name_key: str, default_meta_factory: Callable[[Union[Type, Callable[[...], Any]], Tuple, Mapping], Any] = <function _default_meta_factory>)[source]

Bases: collections.abc.MutableMapping

Universal class allowing to add and access various factories by name

__init__(default_name_key: str, default_meta_factory: Callable[[Union[Type, Callable[[...], Any]], Tuple, Mapping], Any] = <function _default_meta_factory>)[source]
Parameters
  • default_name_key (str) – Default key containing factory name when creating from config

  • default_meta_factory (MetaFactory) – default object that calls factory. Optional. Default just calls factory.

add(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]][source]

Adds factory to registry with it’s __name__ attribute or provided name. Signature is flexible.

Parameters
  • factory – Factory instance

  • factories – More instances

  • name – Provided name for first instance. Use only when pass single instance.

  • named_factories – Factory and their names as kwargs

Returns

First factory passed

Return type

(Factory)

add_from_module(module, prefix: Union[str, List[str]] = None) → None[source]

Adds all factories present in module. If __all__ attribute is present, takes ony what mentioned in it

Parameters
  • module – module to scan

  • prefix (Union[str, List[str]]) – prefix string for all the module’s factories. If prefix is a list, all values will be treated as aliases.

all() → List[str][source]
Returns

list of names of registered items

get(name: str) → Union[Type, Callable[[...], Any], None][source]

Retrieves factory, without creating any objects with it or raises error

Parameters

name – factory name

Returns

factory by name

Return type

Factory

get_from_params(*, meta_factory=None, **kwargs) → Union[Any, Tuple[Any, Mapping[str, Any]]][source]

Creates instance based in configuration dict with instantiation_fn. If config[name_key] is None, None is returned.

Parameters
  • meta_factory – Function that calls factory the right way. If not provided, default is used.

  • **kwargs – additional kwargs for factory

Returns

result of calling instantiate_fn(factory, **config)

get_if_str(obj: Union[str, Type, Callable[[...], Any]])[source]

Returns object from the registry if obj type is string

get_instance(name: str, *args, meta_factory=None, **kwargs)[source]

Creates instance by calling specified factory with instantiate_fn

Parameters
  • name – factory name

  • meta_factory – Function that calls factory the right way. If not provided, default is used

  • args – args to pass to the factory

  • **kwargs – kwargs to pass to the factory

Returns

created instance

late_add(cb: Callable[[Registry], None])[source]

Allows to prevent cycle imports by delaying some imports till next registry query

Parameters

cb – Callback receives registry and must call it’s methods to register factories

len() → int[source]
Returns

length of registered items

exception catalyst.utils.tools.registry.RegistryException(message)[source]

Bases: Exception

Exception class for all registry errors

__init__(message)[source]

Init

class catalyst.utils.tools.seeder.Seeder(init_seed: int = 0, max_seed: int = None)[source]

Bases: object

A random seed generator.

Given an initial seed, the seeder can be called continuously to sample a single or a batch of random seeds.

Note

The seeder creates an independent RandomState to generate random numbers. It does not affect the RandomState in np.random.

Example::
>>> seeder = Seeder(init_seed=0)
>>> seeder(size=5)
[209652396, 398764591, 924231285, 1478610112, 441365315]
__init__(init_seed: int = 0, max_seed: int = None)[source]

Initialize the seeder.

Parameters

init_seed (int, optional) – Initial seed for generating random seeds. Default: 0.

exception catalyst.utils.tools.tensorboard.EventReadingError[source]

Bases: Exception

An exception that correspond to an event file reading error

class catalyst.utils.tools.tensorboard.EventsFileReader(events_file: BinaryIO)[source]

Bases: collections.abc.Iterable

An iterator over a Tensorboard events file

__init__(events_file: BinaryIO)[source]

Initialize an iterator over an events file

Parameters

events_file – An opened file-like object.

class catalyst.utils.tools.tensorboard.SummaryItem(tag, step, wall_time, value, type)

Bases: tuple

property step

Alias for field number 1

property tag

Alias for field number 0

property type

Alias for field number 4

property value

Alias for field number 3

property wall_time

Alias for field number 2

class catalyst.utils.tools.tensorboard.SummaryReader(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))[source]

Bases: collections.abc.Iterable

Iterates over events in all the files in the current logdir. Only scalars and images are supported at the moment.

__init__(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))[source]

Initalize new summary reader

Parameters
  • logdir – A directory with Tensorboard summary data

  • tag_filter – A list of tags to leave (None for all)

  • types – A list of types to get.

  • "scalar" and "image" types are allowed at the moment. (Only) –

catalyst.utils.tools.typing.Model

alias of torch.nn.modules.module.Module

catalyst.utils.tools.typing.Criterion

alias of torch.nn.modules.module.Module

class catalyst.utils.tools.typing.Optimizer(params, defaults)[source]

Bases: object

Base class for all optimizers.

Warning

Parameters need to be specified as collections that have a deterministic ordering that is consistent between runs. Examples of objects that don’t satisfy those properties are sets and iterators over values of dictionaries.

Parameters
  • params (iterable) – an iterable of torch.Tensor s or dict s. Specifies what Tensors should be optimized.

  • defaults – (dict): a dict containing default values of optimization options (used when a parameter group doesn’t specify them).

add_param_group(param_group)[source]

Add a param group to the Optimizer s param_groups.

This can be useful when fine tuning a pre-trained network as frozen layers can be made trainable and added to the Optimizer as training progresses.

Parameters
  • param_group (dict) – Specifies what Tensors should be optimized along with group

  • optimization options. (specific) –

load_state_dict(state_dict)[source]

Loads the optimizer state.

Parameters

state_dict (dict) – optimizer state. Should be an object returned from a call to state_dict().

state_dict()[source]

Returns the state of the optimizer as a dict.

It contains two entries:

  • state - a dict holding current optimization state. Its content

    differs between optimizer classes.

  • param_groups - a dict containing all parameter groups

step(closure)[source]

Performs a single optimization step (parameter update).

Parameters

closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.

zero_grad()[source]

Clears the gradients of all optimized torch.Tensor s.

catalyst.utils.tools.typing.Scheduler

alias of torch.optim.lr_scheduler._LRScheduler

class catalyst.utils.tools.typing.Dataset[source]

Bases: object

An abstract class representing a Dataset.

All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite __getitem__(), supporting fetching a data sample for a given key. Subclasses could also optionally overwrite __len__(), which is expected to return the size of the dataset by many Sampler implementations and the default options of DataLoader.

Note

DataLoader by default constructs a index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.

Criterion

catalyst.utils.criterion.accuracy.accuracy(outputs, targets, topk=(1, ), threshold: float = None, activation: str = None)[source]

Computes the accuracy.

It can be used either for:
  • multi-class task:

    -you can use topk. -threshold and activation are not required. -targets is a tensor: batch_size -outputs is a tensor: batch_size x num_classes -computes the accuracy@k for the specified values of k.

  • OR multi-label task, in this case:

    -you must specify threshold and activation -topk will not be used (because of there is no method to apply top-k in multi-label classification). -outputs, targets are tensors with shape: batch_size x num_classes -targets is a tensor with binary vectors

catalyst.utils.criterion.accuracy.average_accuracy(outputs, targets, k=10)[source]

Computes the average accuracy at k. This function computes the average accuracy at k between two lists of items.

Parameters
  • outputs (list) – A list of predicted elements

  • targets (list) – A list of elements that are to be predicted

  • k (int, optional) – The maximum number of predicted elements

Returns

The average accuracy at k over the input lists

Return type

double

catalyst.utils.criterion.accuracy.mean_average_accuracy(outputs, targets, topk=(1, ))[source]

Computes the mean average accuracy at k. This function computes the mean average accuracy at k between two lists of lists of items.

Parameters
  • outputs (list) – A list of lists of predicted elements

  • targets (list) – A list of lists of elements that are to be predicted

  • topk (int, optional) – The maximum number of predicted elements

Returns

The mean average accuracy at k over the input lists

Return type

double

catalyst.utils.criterion.dice.dice(outputs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

Computes the dice metric

Parameters
  • outputs (list) – A list of predicted elements

  • targets (list) – A list of elements that are to be predicted

  • eps (float) – epsilon

  • threshold (float) – threshold for outputs binarization

  • activation (str) – An torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”]

Returns

Dice score

Return type

double

catalyst.utils.criterion.f1_score.f1_score(outputs: torch.Tensor, targets: torch.Tensor, beta: float = 1.0, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]

Source https://github.com/qubvel/segmentation_models.pytorch

Parameters
  • outputs (torch.Tensor) – A list of predicted elements

  • targets (torch.Tensor) – A list of elements that are to be predicted

  • eps (float) – epsilon to avoid zero division

  • beta (float) – beta param for f_score

  • threshold (float) – threshold for outputs binarization

  • activation (str) – An torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”]

Returns

F_1 score

Return type

float

catalyst.utils.criterion.focal.sigmoid_focal_loss(outputs: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0, alpha: float = 0.25, reduction: str = 'mean')[source]

Compute binary focal loss between target and output logits.

Source https://github.com/BloodAxe/pytorch-toolbelt See losses for details.

Parameters
  • outputs – Tensor of arbitrary shape

  • targets – Tensor of the same shape as input

  • reduction (string, optional) – Specifies the reduction to apply to the output: “none” | “mean” | “sum” | “batchwise_mean”. “none”: no reduction will be applied, “mean”: the sum of the output will be divided by the number of elements in the output, “sum”: the output will be summed.

See https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/loss/losses.py # noqa: E501

catalyst.utils.criterion.focal.reduced_focal_loss(outputs: torch.Tensor, targets: torch.Tensor, threshold: float = 0.5, gamma: float = 2.0, reduction='mean')[source]

Compute reduced focal loss between target and output logits.

Source https://github.com/BloodAxe/pytorch-toolbelt See losses for details.

Parameters
  • outputs – Tensor of arbitrary shape

  • targets – Tensor of the same shape as input

  • reduction (string, optional) –

    Specifies the reduction to apply to the output: “none” | “mean” | “sum” | “batchwise_mean”.

    ”none”: no reduction will be applied, “mean”: the sum of the output will be divided by the number of elements in the output, “sum”: the output will be summed.

    Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction.

    ”batchwise_mean” computes mean loss per sample in batch. Default: “mean”

See https://arxiv.org/abs/1903.01347

catalyst.utils.criterion.iou.iou(outputs: torch.Tensor, targets: torch.Tensor, classes: List[str] = None, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid') → Union[float, List[float]][source]
Parameters
  • outputs (torch.Tensor) – A list of predicted elements

  • targets (torch.Tensor) – A list of elements that are to be predicted

  • eps (float) – epsilon to avoid zero division

  • threshold (float) – threshold for outputs binarization

  • activation (str) – An torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”]

Returns

IoU (Jaccard) score(s)

Return type

Union[float, List[float]]

catalyst.utils.criterion.iou.jaccard(outputs: torch.Tensor, targets: torch.Tensor, classes: List[str] = None, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid') → Union[float, List[float]]
Parameters
  • outputs (torch.Tensor) – A list of predicted elements

  • targets (torch.Tensor) – A list of elements that are to be predicted

  • eps (float) – epsilon to avoid zero division

  • threshold (float) – threshold for outputs binarization

  • activation (str) – An torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”]

Returns

IoU (Jaccard) score(s)

Return type

Union[float, List[float]]

Meters

The meters from torchnet.meters

class catalyst.utils.meters.meter.Meter[source]

Bases: object

Meters provide a way to keep track of important statistics in an online manner.

This class is abstract, but provides a standard interface for all meters to follow.

add(value)[source]

Log a new value to the meter

Parameters

value – Next restult to include.

reset()[source]

Resets the meter to default settings.

value()[source]

Get the value of the meter in the current state.

class catalyst.utils.meters.apmeter.APMeter[source]

Bases: catalyst.utils.meters.meter.Meter

The APMeter measures the average precision per class.

The APMeter is designed to operate on NxK Tensors output and target, and optionally a Nx1 Tensor weight where (1) the output contains model output scores for N examples and K classes that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a sigmoid function); (2) the target contains only values 0 (for negative examples) and 1 (for positive examples); and (3) the weight ( > 0) represents weight for each sample.

add(output, target, weight=None)[source]

Add a new observation

Parameters
  • output (Tensor) – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model. The probabilities should sum to one over all classes

  • target (Tensor) – binary NxK tensort that encodes which of the K classes are associated with the N-th input (eg: a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4)

  • weight (optional, Tensor) – Nx1 tensor representing the weight for each example (each weight > 0)

reset()[source]

Resets the meter with empty member variables

value()[source]

Returns the model”s average precision for each class

Returns

1xK tensor, with avg precision for each class k

Return type

ap (FloatTensor)

class catalyst.utils.meters.aucmeter.AUCMeter[source]

Bases: catalyst.utils.meters.meter.Meter

The AUCMeter measures the area under the receiver-operating characteristic (ROC) curve for binary classification problems. The area under the curve (AUC) can be interpreted as the probability that, given a randomly selected positive example and a randomly selected negative example, the positive example is assigned a higher score by the classification model than the negative example.

The AUCMeter is designed to operate on one-dimensional Tensors output and target, where (1) the output contains model output scores that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a signoid function); and (2) the target contains only values 0 (for negative examples) and 1 (for positive examples).

add(output, target)[source]
reset()[source]
value()[source]
class catalyst.utils.meters.averagevaluemeter.AverageValueMeter[source]

Bases: catalyst.utils.meters.meter.Meter

add(value, n=1)[source]
reset()[source]
value()[source]
class catalyst.utils.meters.classerrormeter.ClassErrorMeter(topk=[1], accuracy=False)[source]

Bases: catalyst.utils.meters.meter.Meter

add(output, target)[source]
reset()[source]
value(k=-1)[source]
class catalyst.utils.meters.confusionmeter.ConfusionMeter(k, normalized=False)[source]

Bases: catalyst.utils.meters.meter.Meter

Maintains a confusion matrix for a given calssification problem.

The ConfusionMeter constructs a confusion matrix for a multi-class classification problems. It does not support multi-label, multi-class problems: for such problems, please use MultiLabelConfusionMeter.

Parameters
  • k (int) – number of classes in the classification problem

  • normalized (boolean) – Determines whether or not the confusion matrix is normalized or not

add(predicted, target)[source]

Computes the confusion matrix of K x K size where K is no of classes

Args: predicted (tensor): Can be an N x K tensor of predicted scores obtained from the model for N examples and K classes or an N-tensor of integer values between 0 and K-1. target (tensor): Can be a N-tensor of integer values assumed to be integer values between 0 and K-1 or N x K tensor, where targets are assumed to be provided as one-hot vectors

reset()[source]
value()[source]
Returns

Confustion matrix of K rows and K columns, where rows corresponds to ground-truth targets and columns corresponds to predicted targets.

class catalyst.utils.meters.mapmeter.mAPMeter[source]

Bases: catalyst.utils.meters.meter.Meter

The mAPMeter measures the mean average precision over all classes.

The mAPMeter is designed to operate on NxK Tensors output and target, and optionally a Nx1 Tensor weight where (1) the output contains model output scores for N examples and K classes that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a sigmoid function); (2) the target contains only values 0 (for negative examples) and 1 (for positive examples); and (3) the weight ( > 0) represents weight for each sample.

add(output, target, weight=None)[source]
reset()[source]
value()[source]
class catalyst.utils.meters.movingaveragevaluemeter.MovingAverageValueMeter(windowsize)[source]

Bases: catalyst.utils.meters.meter.Meter

add(value)[source]
reset()[source]
value()[source]
class catalyst.utils.meters.msemeter.MSEMeter(root=False)[source]

Bases: catalyst.utils.meters.meter.Meter

add(output, target)[source]
reset()[source]
value()[source]