Utils¶

catalyst.utils.argparse.boolean_flag(parser: argparse.ArgumentParser, name: str, default: Optional[bool] = False, help: str = None, shorthand: str = None) → None[source]¶

Add a boolean flag to a parser inplace.

Parameters

parser (argparse.ArgumentParser) – parser to add the flag to
name (str) – argument name; --<name> will enable the flag, while --no-<name> will disable it
default (bool, optional) – default value of the flag
help (str) – help string for the flag
shorthand (str) – shorthand string for the argument

Examples

>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> boolean_flag(
...     parser, "flag", default=False, help="some flag", shorthand="f"
... )
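A paired flag of this kind can be built with plain argparse, as in the minimal sketch below. The helper name add_boolean_flag is hypothetical, and the body is an assumption about the general technique (a store_true/store_false pair sharing one destination), not necessarily how Catalyst implements boolean_flag.

```python
import argparse
from typing import Optional


def add_boolean_flag(
    parser: argparse.ArgumentParser,
    name: str,
    default: Optional[bool] = False,
    help: Optional[str] = None,
    shorthand: Optional[str] = None,
) -> None:
    """Hypothetical helper: register --<name> and --no-<name> on one dest."""
    dest = name.replace("-", "_")
    names = ["--" + name]
    if shorthand is not None:
        names.append("-" + shorthand)
    # --<name> (and the optional -<shorthand>) switch the flag on
    parser.add_argument(
        *names, action="store_true", default=default, dest=dest, help=help
    )
    # --no-<name> writes False into the same destination
    parser.add_argument(
        "--no-" + name, action="store_false", default=default, dest=dest
    )


parser = argparse.ArgumentParser()
add_boolean_flag(parser, "flag", default=False, help="some flag", shorthand="f")
enabled = parser.parse_args(["--flag"]).flag
disabled = parser.parse_args(["--no-flag"]).flag
fallback = parser.parse_args([]).flag
```

Both options share one dest, so whichever appears last on the command line decides the final value.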

catalyst.utils.checkpoint.load_checkpoint(filepath)[source]¶

catalyst.utils.checkpoint.pack_checkpoint(model=None, criterion=None, optimizer=None, scheduler=None, **kwargs)[source]¶

catalyst.utils.checkpoint.save_checkpoint(checkpoint: Dict, logdir: Union[pathlib.Path, str], suffix: str, is_best: bool = False, is_last: bool = False, special_suffix: str = '')[source]¶

catalyst.utils.checkpoint.unpack_checkpoint(checkpoint, model=None, criterion=None, optimizer=None, scheduler=None)[source]¶

catalyst.utils.compression.compress(data)[source]¶

catalyst.utils.compression.compress_if_needed(data)[source]¶

catalyst.utils.compression.decompress(data)[source]¶

catalyst.utils.compression.decompress_if_needed(data)[source]¶

catalyst.utils.compression.is_compressed(data)[source]¶

catalyst.utils.config.load_ordered_yaml(stream, Loader=<class 'yaml.loader.Loader'>, object_pairs_hook=<class 'collections.OrderedDict'>)[source]¶

Loads yaml config into OrderedDict

Parameters

stream – opened file with yaml
Loader – base class for yaml Loader
object_pairs_hook – type of mapping

Returns

configuration

Return type

dict

catalyst.utils.config.get_environment_vars() → Dict[str, Any][source]¶

Creates a dictionary with environment variables

Returns: environment variables
Return type: dict

catalyst.utils.config.dump_environment(experiment_config: Dict, logdir: str, configs_path: List[str] = None) → None[source]¶

Saves config, environment variables and package list in JSON into logdir

Parameters

experiment_config (dict) – experiment config
logdir (str) – path to logdir
configs_path – path(s) to config

catalyst.utils.config.parse_config_args(*, config, args, unknown_args)[source]¶

catalyst.utils.config.parse_args_uargs(args, unknown_args)[source]¶

Function for parsing configuration files

Parameters

args – recognized arguments
unknown_args – unrecognized arguments

Returns

updated arguments, dict with config

Return type

tuple

catalyst.utils.dataset.balance_classes(dataframe: pandas.core.frame.DataFrame, class_column: str = 'label', random_state: int = 42, how: str = 'downsampling') → pandas.core.frame.DataFrame[source]¶

Balance classes in dataframe by class_column

Parameters

dataframe – a dataset
class_column – which column to use for split
random_state – seed for random shuffle
how – sampling strategy; must be one of [“downsampling”, “upsampling”]

Returns

new dataframe with balanced class_column

Return type

pd.DataFrame
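The "downsampling" strategy can be illustrated without pandas. The sketch below works on a plain list of row dicts. The function name downsample_rows is hypothetical, and sampling every class down to the size of the rarest class is an assumption about what downsampling means here.

```python
import random
from collections import defaultdict
from typing import Dict, List


def downsample_rows(
    rows: List[dict], class_column: str = "label", random_state: int = 42
) -> List[dict]:
    """Keep the same number of rows per class by sampling down to the rarest class."""
    groups: Dict[object, List[dict]] = defaultdict(list)
    for row in rows:
        groups[row[class_column]].append(row)
    # Size of the rarest class (assumed target size for every class)
    target = min(len(group) for group in groups.values())
    rng = random.Random(random_state)  # seeded for reproducibility
    balanced: List[dict] = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced


rows = [{"label": "cat"}] * 5 + [{"label": "dog"}] * 2
balanced = downsample_rows(rows)
counts = {
    label: sum(row["label"] == label for row in balanced)
    for label in ("cat", "dog")
}
```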

catalyst.utils.dataset.column_fold_split(dataframe: pandas.core.frame.DataFrame, column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]¶

Splits DataFrame into N folds.

Parameters

dataframe – a dataset
column – which column to use
random_state – seed for random shuffle
n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.utils.dataset.create_dataframe(dataset: Dict[str, object], **dataframe_args) → pandas.core.frame.DataFrame[source]¶

Create pd.DataFrame from dict like {key: [values]}

Parameters

dataset – dict like {key: [values]}
**dataframe_args –

index: Index or array-like
Index to use for resulting frame. Will default to np.arange(n) if no indexing information is part of the input data and no index is provided

columns: Index or array-like
Column labels to use for resulting frame. Will default to np.arange(n) if no column labels are provided

dtype: dtype, default None
Data type to force, otherwise infer

Returns

dataframe built from the given dataset

Return type

pd.DataFrame

catalyst.utils.dataset.create_dataset(dirs: str, extension: str = None, process_fn: Callable[[str], object] = None, recursive: bool = False) → Dict[str, object][source]¶

Create dataset (dict like {key: [values]}) from vctk-like dataset:

dataset/
    cat/
        *.ext
    dog/
        *.ext

Parameters

dirs (str) – path to dirs, for example /home/user/data/**
extension (str) – data extension you are looking for
process_fn (Callable[[str], object]) – function(path_to_file) -> object process function for found files, by default
recursive (bool) – enables recursive globbing

Returns

dataset

Return type

dict

catalyst.utils.dataset.default_fold_split(dataframe: pandas.core.frame.DataFrame, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]¶

Splits DataFrame into N folds.

Parameters

dataframe – a dataset
random_state – seed for random shuffle
n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.utils.dataset.prepare_dataset_labeling(dataframe: pandas.core.frame.DataFrame, class_column: str) → Dict[str, int][source]¶

Prepares a mapping using unique values from class_column

{
    "class_name_0": 0,
    "class_name_1": 1,
    ...
    "class_name_N": N
}

Parameters

dataframe – a dataset
class_column – which column to use

Returns

mapping from tag to labels

Return type

Dict[str, int]
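The mapping described above can be sketched in plain Python. The name build_label_mapping is hypothetical. Sorting the unique values before numbering them is an assumption made here so that the result is deterministic.

```python
from typing import Dict, Iterable


def build_label_mapping(values: Iterable[object]) -> Dict[str, int]:
    """Map each distinct value to a consecutive integer label."""
    # Assumption: unique values are converted to str and sorted
    unique_sorted = sorted({str(value) for value in values})
    return {name: index for index, name in enumerate(unique_sorted)}


mapping = build_label_mapping(["dog", "cat", "dog", "bird"])
```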

catalyst.utils.dataset.separate_tags(dataframe: pandas.core.frame.DataFrame, tag_column: str = 'label', tag_delim: str = '-') → pandas.core.frame.DataFrame[source]¶

Separates values in the tag_column column

Parameters

dataframe – a dataset
tag_column – column name to separate values
tag_delim – delimiter to separate values

Returns

new dataframe

Return type

pd.DataFrame

catalyst.utils.dataset.split_dataframe(dataframe: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]¶

Split dataframe in train and test part.

Parameters

dataframe – pd.DataFrame to split
**train_test_split_args –

test_size: float, int, or None (default is None)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.

train_size: float, int, or None (default is None)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.

random_state: int or RandomState
Pseudo-random number generator state used for random sampling.

stratify: array-like or None (default is None)
If not None, data is split in a stratified fashion, using this as the class labels.

Returns

train and test DataFrames

P.S. This function exists because the sklearn split is overcomplicated.

catalyst.utils.dataset.split_dataset(dataset: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[Dict[str, object], Dict[str, object]][source]¶

Split dataset in train and test parts.

Parameters

dataset – dict like dataset
**train_test_split_args –

test_size: float, int, or None (default is None)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.

train_size: float, int, or None (default is None)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.

random_state: int or RandomState
Pseudo-random number generator state used for random sampling.

stratify: array-like or None (default is None)
If not None, data is split in a stratified fashion, using this as the class labels.

Returns

train and test dicts

catalyst.utils.dataset.stratified_fold_split(dataframe: pandas.core.frame.DataFrame, class_column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]¶

Splits DataFrame into N stratified folds.

Also see catalyst.data.sampler.BalanceClassSampler

Parameters

dataframe – a dataset
class_column – which column to use for split
random_state – seed for random shuffle
n_folds – number of result folds

Returns

new dataframe with fold column

Return type

pd.DataFrame

catalyst.utils.ddp.get_real_module(model: torch.nn.modules.module.Module) → torch.nn.modules.module.Module[source]¶

Return a real model from a torch.nn.DataParallel, torch.nn.parallel.DistributedDataParallel, or apex.parallel.DistributedDataParallel.

Parameters: model – A model, or DataParallel wrapper.
Returns: A model

catalyst.utils.ddp.is_wrapped_with_ddp(model: torch.nn.modules.module.Module) → bool[source]¶: Checks whether model is wrapped with DataParallel/DistributedDataParallel.

class catalyst.utils.frozen.FrozenClass[source]¶

Class which prohibits __setattr__ on existing attributes

Examples

>>> class RunnerState(FrozenClass):
...     pass

catalyst.utils.hash.get_hash(obj: Any) → str[source]¶

Creates a unique hash from an object in the following way:

- Represent obj as a string recursively
- Hash this string with the sha256 hash function
- Encode the hash with url-safe base64 encoding

Parameters: obj – object to hash
Returns: base64-encoded string
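The three steps can be sketched with the standard library. The name sketch_hash is hypothetical. Using repr() for the string step is an assumption, since the exact recursive string representation is not specified here.

```python
import base64
import hashlib
from typing import Any


def sketch_hash(obj: Any) -> str:
    # Step 1: string representation (repr() is an assumption, see above)
    as_string = repr(obj)
    # Step 2: sha256 digest of the UTF-8 encoded string
    digest = hashlib.sha256(as_string.encode("utf-8")).digest()
    # Step 3: url-safe base64 encoding of the raw digest bytes
    return base64.urlsafe_b64encode(digest).decode("ascii")


config_hash = sketch_hash({"lr": 0.001, "layers": [64, 64]})
```

The url-safe alphabet replaces "+" and "/" with "-" and "_", so the result can be used in file names and URLs.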

catalyst.utils.hash.get_short_hash(o) → str[source]¶

catalyst.utils.image.has_image_extension(uri) → bool[source]¶

Check that file has image extension

Parameters: uri (Union[str, pathlib.Path]) – The resource to load the file from
Returns: True if file has image extension, False otherwise
Return type: bool

catalyst.utils.image.imread(uri, grayscale: bool = False, expand_dims: bool = True, rootpath: Union[str, pathlib.Path] = None, **kwargs)[source]¶

Parameters

uri ({str, pathlib.Path, bytes, file}) – The resource to load the image from, e.g. a filename, pathlib.Path, http address or file object, see the docs for more info.
grayscale –
expand_dims –
rootpath –

Returns:

catalyst.utils.image.mask_to_overlay_image(image: numpy.ndarray, masks: List[numpy.ndarray], threshold: float = 0, mask_strength: float = 0.5) → numpy.ndarray[source]¶

Draws every mask with some color over the image

Parameters

image (np.ndarray) – RGB image used as underlay for masks
masks (List[np.ndarray]) – list of masks
threshold (float) – threshold for masks binarization
mask_strength (float) – opacity of colorized masks

Returns

HxWx3 image with overlay

Return type

np.ndarray

catalyst.utils.image.mimread(uri, clip_range: Tuple[int, int] = None, expand_dims: bool = True, rootpath: Union[str, pathlib.Path] = None, **kwargs)[source]¶

Parameters

uri ({str, pathlib.Path, bytes, file}) – The resource to load the mask from, e.g. a filename, pathlib.Path, http address or file object, see the docs for more info.
clip_range (Tuple[int, int]) – lower and upper interval edges, image values outside the interval are clipped to the interval edges
expand_dims (bool) – if True, append channel axis to grayscale images
rootpath (Union[str, pathlib.Path]) – path to an image (allows to use relative path)

Returns

Image

Return type

np.ndarray

catalyst.utils.image.mimwrite_with_meta(uri, ims, meta, **kwargs)[source]¶

catalyst.utils.image.tensor_from_rgb_image(image: numpy.ndarray) → torch.Tensor[source]¶

catalyst.utils.image.tensor_to_ndimage(images: torch.Tensor, mean: Tuple[float, float, float] = (0.485, 0.456, 0.406), std: Tuple[float, float, float] = (0.229, 0.224, 0.225), dtype=<class 'numpy.float32'>) → numpy.ndarray[source]¶

Convert float image(s) with standard normalization to np.ndarray with [0..1] when dtype is np.float32 and [0..255] when dtype is np.uint8.

Parameters

images – [B]xCxHxW float tensor
mean – mean to add
std – std to multiply
dtype – result ndarray dtype. Only float32 and uint8 are supported.

Returns

[B]xHxWxC np.ndarray of dtype

catalyst.utils.initialization.bias_init_with_prob(prior_prob)[source]¶: Initialize conv/fc bias value according to the given probability

catalyst.utils.initialization.constant_init(module, val, bias=0)[source]¶: Initialize the module with constant value

catalyst.utils.initialization.create_optimal_inner_init(nonlinearity: torch.nn.modules.module.Module, **kwargs) → Callable[[torch.nn.modules.module.Module], None][source]¶

Create initializer for inner layers based on their activation function (nonlinearity).

Parameters: nonlinearity – non-linear activation

catalyst.utils.initialization.kaiming_init(module, mode='fan_out', nonlinearity='relu', bias=0, distribution='normal')[source]¶: Initialize the module with He initialization

catalyst.utils.initialization.normal_init(module, mean=0, std=1, bias=0)[source]¶: Initialize the module with normal distribution

catalyst.utils.initialization.outer_init(layer: torch.nn.modules.module.Module) → None[source]¶: Initialization for output layers of policy and value networks typically used in deep reinforcement learning literature.

catalyst.utils.initialization.uniform_init(module, a=0, b=1, bias=0)[source]¶: Initialize the module with uniform distribution

catalyst.utils.initialization.xavier_init(module, gain=1, bias=0, distribution='normal')[source]¶: Initialize the module with Xavier initialization

catalyst.utils.misc.append_dict(dict1, dict2)[source]¶: Appends dict2 with the same keys as dict1 to dict1

catalyst.utils.misc.args_are_not_none(*args: Optional[Any]) → bool[source]¶

Check that all arguments are not None

Parameters: *args (Any) – values
Returns: True if all values are not None, False otherwise
Return type: bool

catalyst.utils.misc.copy_directory(input_dir: pathlib.Path, output_dir: pathlib.Path) → None[source]¶

Recursively copies the input directory

Parameters

input_dir (Path) – input directory
output_dir (Path) – output directory

catalyst.utils.misc.flatten_dict(dictionary: Dict[str, Any], parent_key: str = '', separator: str = '/') → collections.OrderedDict[source]¶

Flattens the given dictionary

Parameters

dictionary (dict) – given dictionary
parent_key (str, optional) – prefix nested keys with string parent_key
separator (str, optional) – delimiter between parent_key and key to use

Returns

ordered dictionary with flattened keys

Return type

collections.OrderedDict
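A minimal sketch of this kind of flattening is given below, using a hypothetical flatten function with the same three parameters. It illustrates the technique and is not necessarily Catalyst's exact implementation.

```python
from collections import OrderedDict
from typing import Any, Dict


def flatten(
    dictionary: Dict[str, Any], parent_key: str = "", separator: str = "/"
) -> OrderedDict:
    """Join nested keys with `separator`, recursing into nested dicts."""
    items = OrderedDict()
    for key, value in dictionary.items():
        new_key = parent_key + separator + key if parent_key else key
        if isinstance(value, dict):
            # Recurse and carry the accumulated prefix along
            items.update(flatten(value, new_key, separator))
        else:
            items[new_key] = value
    return items


flat = flatten({"optimizer": {"lr": 0.01, "betas": {"b1": 0.9}}, "epochs": 3})
```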

catalyst.utils.misc.format_metric(name: str, value: float) → str[source]¶

Format metric. Metric will be returned in the scientific format if 4 decimal chars are not enough (metric value lower than 1e-4)

Parameters

name (str) – metric name
value (float) – value of metric
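The switch between fixed-point and scientific notation can be sketched as follows. The name sketch_format_metric, the "name=value" layout and the three-digit mantissa are all assumptions made for illustration. Only the 1e-4 threshold comes from the description above.

```python
def sketch_format_metric(name: str, value: float) -> str:
    # Below 1e-4 four decimals would print as 0.0000, so use scientific notation
    if value < 1e-4:
        return f"{name}={value:1.3e}"
    return f"{name}={value:.4f}"


regular = sketch_format_metric("loss", 0.25)
tiny = sketch_format_metric("lr", 0.00003)
```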

catalyst.utils.misc.get_utcnow_time(format: str = None) → str[source]¶

Return string with current utc time in chosen format

Parameters: format (str) – format string; if None, “%y%m%d.%H%M%S” will be used.
Returns: formatted utc time string
Return type: str

catalyst.utils.misc.is_exception(ex: Any) → bool[source]¶: Check if the argument is of Exception type

catalyst.utils.misc.make_tuple(tuple_like)[source]¶

Creates a tuple if given tuple_like value isn’t list or tuple

Returns: tuple or list

catalyst.utils.misc.maybe_recursive_call(object_or_dict, method: str, recursive_args=None, recursive_kwargs=None, **kwargs)[source]¶

Calls the method recursively for the object_or_dict

Parameters

object_or_dict (Any) – some object or a dictionary of objects
method (str) – method name to call
recursive_args – list of arguments to pass to the method
recursive_kwargs – keyword arguments to pass to the method
**kwargs – Arbitrary keyword arguments

catalyst.utils.misc.merge_dicts(*dicts: dict) → dict[source]¶

Recursive dict merge. Instead of updating only top-level keys, merge_dicts recurses down into dicts nested to an arbitrary depth, updating keys.

Parameters: *dicts – several dictionaries to merge
Returns: deep-merged dictionary
Return type: dict
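The difference from a plain dict.update is easiest to see on a nested config. The sketch below uses a hypothetical deep_merge function. Letting later dictionaries override earlier ones and leaving the inputs unmodified are assumptions of this sketch.

```python
from copy import deepcopy


def deep_merge(*dicts: dict) -> dict:
    """Merge dicts left to right, recursing where both sides hold a dict."""
    merged = deepcopy(dicts[0]) if dicts else {}
    for other in dicts[1:]:
        for key, value in other.items():
            if isinstance(merged.get(key), dict) and isinstance(value, dict):
                # Both values are dicts: merge them instead of replacing
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = deepcopy(value)
    return merged


base = {"model": {"name": "resnet", "depth": 18}, "seed": 1}
override = {"model": {"depth": 34}, "seed": 2}
merged = deep_merge(base, override)
```

A top-level update would have dropped "name" from the "model" section, while the recursive merge keeps it.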

catalyst.utils.misc.pairwise(iterable: Iterable[Any]) → Iterable[Any][source]¶

Iterate sequences by pairs

Parameters: iterable – Any iterable sequence
Returns: pairwise iterator

Examples

>>> for i in pairwise([1, 2, 5, -3]):
...     print(i)
(1, 2)
(2, 5)
(5, -3)
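One common way to build such an iterator is the itertools.tee recipe sketched below. The name pairwise_sketch is hypothetical, and this is not necessarily how the function above is implemented.

```python
from itertools import tee
from typing import Any, Iterable, Iterator, Tuple


def pairwise_sketch(iterable: Iterable[Any]) -> Iterator[Tuple[Any, Any]]:
    first, second = tee(iterable)
    # Advance the second copy by one so the two iterators are offset
    next(second, None)
    return zip(first, second)


pairs = list(pairwise_sketch([1, 2, 5, -3]))
```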

catalyst.utils.numpy.dict2structed(array: Dict)[source]¶

catalyst.utils.numpy.geometric_cumsum(alpha, x)[source]¶

Calculate future accumulated sums for each element in a list with an exponential factor.

Given input data \(x_1, \dots, x_n\) and exponential factor \(\alpha \in [0, 1]\), it returns an array \(y\) with the same length, where each element is calculated as follows:

\[y_i = x_i + \alpha x_{i+1} + \alpha^2 x_{i+2} + \dots + \alpha^{n-i-1}x_{n-1} + \alpha^{n-i}x_{n}\]

Note

To gain the optimal runtime speed, we use scipy.signal.lfilter

Example

>>> geometric_cumsum(0.1, [[1, 1], [2, 2], [3, 3], [4, 4]])
array([[1.234, 1.234], [2.34 , 2.34 ], [3.4  , 3.4  ], [4.   , 4.   ]])

Parameters

alpha (float) – exponential factor between zero and one.
x (np.ndarray) – input data, [trajectory_len, num_atoms]

Returns

calculated data

Return type

out (np.ndarray)

source: https://github.com/zuoxingdong/lagom
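The formula for \(y_i\) satisfies the recurrence \(y_i = x_i + \alpha y_{i+1}\), which can be evaluated in one backward pass. The sketch below is a pure-Python, one-dimensional illustration under that observation. The name geometric_cumsum_1d is hypothetical, and it does not use scipy.signal.lfilter.

```python
from typing import List, Sequence


def geometric_cumsum_1d(alpha: float, x: Sequence[float]) -> List[float]:
    result = [0.0] * len(x)
    running = 0.0
    # Walk from the last element to the first: y_i = x_i + alpha * y_{i+1}
    for i in range(len(x) - 1, -1, -1):
        running = x[i] + alpha * running
        result[i] = running
    return result


y = geometric_cumsum_1d(0.1, [1, 2, 3, 4])
```

For this input the values agree with each column of the example above: 1.234, 2.34, 3.4 and 4.0, up to floating-point rounding.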

catalyst.utils.numpy.get_one_hot(label: int, num_classes: int, smoothing: float = None) → numpy.ndarray[source]¶

Applies one-hot vectorization to a given scalar, optionally with label smoothing from https://arxiv.org/abs/1812.01187

Parameters

label (int) – scalar value to be vectorized
num_classes (int) – total number of classes
smoothing (float, optional) – if specified applies label smoothing from Bag of Tricks for Image Classification with Convolutional Neural Networks paper

Returns

a one-hot vector with shape (num_classes,)

Return type

np.ndarray
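The sketch below shows one common formulation of one-hot encoding with label smoothing, using plain lists instead of NumPy arrays. The name one_hot_sketch is hypothetical. Spreading the smoothing mass evenly over the other num_classes - 1 entries is an assumption, and the function above may distribute it differently.

```python
from typing import List, Optional


def one_hot_sketch(
    label: int, num_classes: int, smoothing: Optional[float] = None
) -> List[float]:
    if smoothing is None:
        vector = [0.0] * num_classes
        vector[label] = 1.0
        return vector
    # Assumption: the true class keeps 1 - smoothing, the rest share smoothing
    off_value = smoothing / (num_classes - 1)
    vector = [off_value] * num_classes
    vector[label] = 1.0 - smoothing
    return vector


hard = one_hot_sketch(1, 3)
soft = one_hot_sketch(1, 4, smoothing=0.1)
```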

catalyst.utils.numpy.np_softmax(x)[source]¶

catalyst.utils.numpy.structed2dict(array: numpy.ndarray)[source]¶

catalyst.utils.pandas.dataframe_to_list(dataframe: pandas.core.frame.DataFrame) → List[dict][source]¶

Converts dataframe to a list of rows (without indexes)

Parameters: dataframe (DataFrame) – input dataframe
Returns: list of rows
Return type: (List[dict])

catalyst.utils.pandas.folds_to_list(folds: Union[list, str, pandas.core.series.Series]) → List[int][source]¶

Formats either a string or a list of numbers into a list of unique ints

Parameters: folds (Union[list, str, pd.Series]) – Either list of numbers or one string with numbers separated by commas or pandas series
Returns: list of unique ints
Return type: List[int]

Examples

>>> folds_to_list("1,2,1,3,4,2,4,6")
[1, 2, 3, 4, 6]
>>> folds_to_list([1, 2, 3.0, 5])
[1, 2, 3, 5]

Raises: ValueError – if a value in the string or array cannot be cast to int
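The behaviour shown in the examples can be reproduced with a few lines of plain Python. The sketch below uses a hypothetical folds_to_list_sketch function and leaves out the pandas Series case. Returning the folds in sorted order is an assumption that is consistent with the examples.

```python
from typing import List, Sequence, Union


def folds_to_list_sketch(folds: Union[str, Sequence]) -> List[int]:
    if isinstance(folds, str):
        # "1,2,3" -> ["1", "2", "3"]
        folds = folds.split(",")
    # int() raises ValueError for values that cannot be converted
    return sorted({int(x) for x in folds})


from_string = folds_to_list_sketch("1,2,1,3,4,2,4,6")
from_list = folds_to_list_sketch([1, 2, 3.0, 5])
```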

catalyst.utils.pandas.map_dataframe(dataframe: pandas.core.frame.DataFrame, tag_column: str, class_column: str, tag2class: Dict[str, int], verbose: bool = False) → pandas.core.frame.DataFrame[source]¶

Maps tags from tag_column to ints in class_column using the tag2class dictionary

Parameters

dataframe (pd.DataFrame) – input dataframe
tag_column (str) – column with tags
class_column (str) –
tag2class (Dict[str, int]) – mapping from tags to class labels
verbose (bool) – if True, uses tqdm

Returns

updated dataframe with class_column

Return type

pd.DataFrame

catalyst.utils.pandas.merge_multiple_fold_csv(fold_name: str, paths: Optional[str]) → pandas.core.frame.DataFrame[source]¶

Reads csv into one DataFrame with column fold

Parameters

fold_name (str) – current fold name
paths (str) – paths to csv separated by commas

Returns: merged dataframes with column fold == fold_name
Return type: pd.DataFrame

catalyst.utils.pandas.read_csv_data(in_csv: str = None, train_folds: Optional[List[int]] = None, valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, seed: int = 42, n_folds: int = 5, in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, List[dict], List[dict], List[dict]][source]¶

Reads a dataframe from the given path in_csv and splits it into train/valid/infer folds, or reads independent folds from several paths in_csv_train, in_csv_valid, in_csv_infer.

Note

This function can be used with different combinations of params.

The first block is used to get a dataset from one csv: in_csv, train_folds, valid_folds, infer_folds, seed, n_folds
The second includes paths to different csv files for the train/valid and infer parts: in_csv_train, in_csv_valid, in_csv_infer
The other params (tag2class, tag_column, class_column) are optional for either of the previous blocks

Parameters

in_csv (str) – paths to whole dataset
train_folds (List[int]) – train folds
valid_folds (List[int], optional) – valid folds. If none takes all folds not included in train_folds
infer_folds (List[int], optional) – infer folds. If none takes all folds not included in train_folds and valid_folds
seed (int) – seed for split
n_folds (int) – number of folds
in_csv_train (str) – paths to train csv separated by commas
in_csv_valid (str) – paths to valid csv separated by commas
in_csv_infer (str) – paths to infer csv separated by commas
tag2class (Dict[str, int]) – mapping from label names into ints
tag_column (str) – column with label names
class_column (str) – column to use for split

Returns

tuple with 4 elements (whole dataframe, list with train data, list with valid data and list with infer data)

Return type

(Tuple[pd.DataFrame, List[dict], List[dict], List[dict]])

catalyst.utils.pandas.read_multiple_dataframes(in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]¶

This function reads train/valid/infer dataframes from the given paths

Parameters

in_csv_train (str) – paths to train csv separated by commas
in_csv_valid (str) – paths to valid csv separated by commas
in_csv_infer (str) – paths to infer csv separated by commas
tag2class (Dict[str, int], optional) – mapping from label names into int
tag_column (str, optional) – column with label names
class_column (str, optional) – column to use for split

Returns

tuple with 4 dataframes: whole dataframe, train part, valid part and infer part

Return type

(tuple)

catalyst.utils.pandas.split_dataframe(dataframe: pandas.core.frame.DataFrame, train_folds: List[int], valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, tag2class: Optional[Dict[str, int]] = None, tag_column: str = None, class_column: str = None, seed: int = 42, n_folds: int = 5) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]¶

Split a Pandas DataFrame into folds.

Parameters

dataframe (pd.DataFrame) – input dataframe
train_folds (List[int]) – train folds
valid_folds (List[int], optional) – valid folds. If none takes all folds not included in train_folds
infer_folds (List[int], optional) – infer folds. If none takes all folds not included in train_folds and valid_folds
tag2class (Dict[str, int], optional) – mapping from label names into int
tag_column (str, optional) – column with label names
class_column (str, optional) – column to use for split
seed (int) – seed for split
n_folds (int) – number of folds

Returns

tuple with 4 dataframes: whole dataframe, train part, valid part and infer part

Return type

(tuple)

catalyst.utils.plotly.plot_tensorboard_log(logdir: Union[str, pathlib.Path], step: Optional[str] = 'batch', metrics: Optional[List[str]] = None, height: Optional[int] = None, width: Optional[int] = None) → None[source]¶

catalyst.utils.scripts.import_module(expdir: pathlib.Path)[source]¶

catalyst.utils.scripts.dump_code(expdir, logdir)[source]¶

catalyst.utils.scripts.dump_python_files(src, dst)[source]¶

class catalyst.utils.seed.Seeder(init_seed: int = 0, max_seed: int = None)[source]¶

A random seed generator.

Given an initial seed, the seeder can be called continuously to sample a single or a batch of random seeds.

Note

The seeder creates an independent RandomState to generate random numbers. It does not affect the RandomState in np.random.

Example

>>> seeder = Seeder(init_seed=0)
>>> seeder(size=5)
[209652396, 398764591, 924231285, 1478610112, 441365315]

__init__(init_seed: int = 0, max_seed: int = None)[source]¶

Initialize the seeder.

Parameters: init_seed (int, optional) – Initial seed for generating random seeds. Default: 0.

catalyst.utils.seed.set_global_seed(seed: int) → None[source]¶

Sets random seed into PyTorch, TensorFlow, Numpy and Random.

Parameters: seed – random seed

catalyst.utils.serialization.deserialize(data)¶

Deserialize bytes into an object using pickle

Parameters: data (bytes) – a bytes object containing pickle-serialized data.
Returns: a value deserialized from the bytes-like object.

catalyst.utils.serialization.pickle_deserialize(data)[source]¶

Deserialize bytes into an object using pickle

Parameters: data (bytes) – a bytes object containing pickle-serialized data.
Returns: a value deserialized from the bytes-like object.

catalyst.utils.serialization.pickle_serialize(data)[source]¶

Serialize the data into bytes using pickle

Parameters: data – a value
Returns: a bytes object containing the pickle-serialized data.
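The round trip that the pickle-based helpers describe can be sketched with the standard pickle module directly. That the helpers amount to pickle.dumps and pickle.loads is an assumption. The payload below is made up for illustration.

```python
import pickle

payload = {"epoch": 3, "metrics": [0.91, 0.87]}

# Serialize: Python object -> bytes
blob = pickle.dumps(payload)

# Deserialize: bytes -> an equal (but distinct) Python object
restored = pickle.loads(blob)
```

Unpickling can execute arbitrary code, so bytes should only be deserialized when they come from a trusted source.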

catalyst.utils.serialization.pyarrow_deserialize(data)[source]¶

Deserialize bytes into an object using pyarrow

Parameters: data (bytes) – a bytes object containing pyarrow-serialized data.
Returns: a value deserialized from the bytes-like object.

catalyst.utils.serialization.pyarrow_serialize(data)[source]¶

Serialize the data into bytes using pyarrow

Parameters: data – a value
Returns: a bytes object containing the pyarrow-serialized data.

catalyst.utils.serialization.serialize(data)¶

Serialize the data into bytes using pickle

Parameters: data – a value
Returns: a bytes object containing the pickle-serialized data.

exception catalyst.utils.tensorboard.EventReadingError[source]¶: An exception that corresponds to an event file reading error

class catalyst.utils.tensorboard.EventsFileReader(events_file: BinaryIO)[source]¶

An iterator over a Tensorboard events file

__init__(events_file: BinaryIO)[source]¶

Initialize an iterator over an events file

Parameters: events_file – An opened file-like object.

class catalyst.utils.tensorboard.SummaryItem(tag, step, wall_time, value, type)¶

property step¶: Alias for field number 1

property tag¶: Alias for field number 0

property type¶: Alias for field number 4

property value¶: Alias for field number 3

property wall_time¶: Alias for field number 2

class catalyst.utils.tensorboard.SummaryReader(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))[source]¶

Iterates over events in all the files in the current logdir. Only scalars and images are supported at the moment.

__init__(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))[source]¶

Initialize a new summary reader

Parameters

logdir – A directory with Tensorboard summary data
tag_filter – A list of tags to leave (None for all)
types – A list of types to get. Only "scalar" and "image" types are allowed at the moment.

catalyst.utils.torch.ce_with_logits(logits, target)[source]¶: Returns cross entropy for the given logits

catalyst.utils.torch.log1p_exp(x)[source]¶: Computationally stable function for computing log(1+exp(x)).
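The stability problem is that exp(x) overflows for large x even though log(1+exp(x)) is then almost exactly x. A common rewrite is log(1+exp(x)) = max(x, 0) + log(1+exp(-|x|)). The scalar sketch below illustrates it with the math module. The name log1p_exp_sketch is hypothetical, and the tensor version above may use a different formulation.

```python
import math


def log1p_exp_sketch(x: float) -> float:
    # exp(-|x|) is always in (0, 1], so it can never overflow
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


large = log1p_exp_sketch(1000.0)  # naive math.exp(1000.0) would overflow
at_zero = log1p_exp_sketch(0.0)   # log(2)
```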

catalyst.utils.torch.normal_sample(mu, sigma)[source]¶: Sample from multivariate Gaussian distribution z ~ N(z|mu,sigma) while supporting backpropagation through its mean and variance.

catalyst.utils.torch.normal_logprob(mu, sigma, z)[source]¶: Probability density function of multivariate Gaussian distribution N(z|mu,sigma).

catalyst.utils.torch.soft_update(target, source, tau)[source]¶: Updates the target data with smoothing by tau
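Smoothing by tau is commonly done as Polyak averaging: target = (1 - tau) * target + tau * source. The sketch below applies that rule to plain lists of floats. The name soft_update_sketch is hypothetical, and this weighting convention is an assumption about the function above, which operates on model parameters rather than lists.

```python
from typing import List


def soft_update_sketch(target: List[float], source: List[float], tau: float) -> None:
    """Move each target value a fraction tau of the way toward the source, in place."""
    for i, (t, s) in enumerate(zip(target, source)):
        target[i] = (1.0 - tau) * t + tau * s


target = [0.0, 10.0]
source = [1.0, 0.0]
soft_update_sketch(target, source, tau=0.1)
```

With tau = 1 the target becomes a copy of the source, and with tau = 0 it is left unchanged.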

catalyst.utils.torch.get_optimizable_params(model_or_params)[source]¶: Returns all the parameters that require gradients

catalyst.utils.torch.get_optimizer_momentum(optimizer: torch.optim.optimizer.Optimizer) → float[source]¶

Get momentum of current optimizer.

Parameters: optimizer – PyTorch optimizer
Returns: momentum at first param group
Return type: float

catalyst.utils.torch.set_optimizer_momentum(optimizer: torch.optim.optimizer.Optimizer, value: float, index: int = 0)[source]¶

Set the momentum of the index-th param group of optimizer to value

Parameters

optimizer – PyTorch optimizer
value (float) – new value of momentum
index (int, optional) – integer index of optimizer’s param groups, default is 0

catalyst.utils.torch.assert_fp16_available() → None[source]¶: Asserts for installed and available Apex FP16

catalyst.utils.torch.get_device() → torch.device[source]¶: Simply returns the best available device (GPU or CPU)

catalyst.utils.torch.get_available_gpus()[source]¶

Array of available GPU ids

Returns: available GPU ids
Return type: iterable

Examples

>>> os.environ["CUDA_VISIBLE_DEVICES"] = "0,2"
>>> get_available_gpus()
[0, 2]

>>> os.environ["CUDA_VISIBLE_DEVICES"] = "0,-1,1"
>>> get_available_gpus()
[0]

>>> os.environ["CUDA_VISIBLE_DEVICES"] = ""
>>> get_available_gpus()
[]

>>> os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
>>> get_available_gpus()
[]
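The examples above suggest that ids are read left to right and that everything from the first negative id onward is ignored. The sketch below reproduces only that parsing step. The name parse_visible_devices is hypothetical, and the sketch does not cover the case where the environment variable is unset.

```python
from typing import List


def parse_visible_devices(value: str) -> List[int]:
    ids: List[int] = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue  # an empty string yields no devices
        device_id = int(token)
        if device_id < 0:
            break  # assumption: a negative id hides every id listed after it
        ids.append(device_id)
    return ids


results = [parse_visible_devices(v) for v in ("0,2", "0,-1,1", "", "-1")]
```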

catalyst.utils.torch.get_activation_fn(activation: str = None)[source]¶: Returns the activation function from torch.nn by its name

catalyst.utils.torch.any2device(value, device: Union[str, torch.device])[source]¶

Move tensor, list of tensors, list of list of tensors, dict of tensors, tuple of tensors to target device.

Parameters

value – Object to be moved
device (Device) – target device ids

Returns

Same structure as value, but all tensors and np.arrays moved to device

catalyst.utils.torch.prepare_cudnn(deterministic: bool = None, benchmark: bool = None) → None[source]¶

Prepares CuDNN benchmark and sets CuDNN to deterministic/non-deterministic mode

Parameters

deterministic (bool) – deterministic mode if running in CuDNN backend.
benchmark (bool) – If True use CuDNN heuristics to figure out which algorithm will be most performant for your model architecture and input. Setting it to False may slow down your training.

catalyst.utils.torch.process_model_params(model: torch.nn.modules.module.Module, layerwise_params: Dict[str, dict] = None, no_bias_weight_decay: bool = True, lr_scaling: float = 1.0) → List[Union[torch.nn.parameter.Parameter, dict]][source]¶

Collects model parameters for torch.optim.Optimizer

Parameters

model (torch.nn.Module) – Model to process
layerwise_params (Dict) – Order-sensitive dict where each key is regex pattern and values are layer-wise options for layers matching with a pattern
no_bias_weight_decay (bool) – If true, removes weight_decay for all bias parameters in the model
lr_scaling (float) – layer-wise learning rate scaling, if 1.0, learning rates will not be scaled

Returns

parameters for an optimizer

Return type

iterable

Examples

>>> model = catalyst.contrib.models.segmentation.ResnetUnet()
>>> layerwise_params = collections.OrderedDict([
...     ("conv1.*", dict(lr=0.001, weight_decay=0.0003)),
...     ("conv.*", dict(lr=0.002))
... ])
>>> params = process_model_params(model, layerwise_params)
>>> optimizer = torch.optim.Adam(params, lr=0.0003)

catalyst.utils.torch.set_requires_grad(model: torch.nn.modules.module.Module, requires_grad: bool)[source]¶

Sets the requires_grad value for all model parameters.

Parameters

model (torch.nn.Module) – Model
requires_grad (bool) – value

Examples

>>> model = SimpleModel()
>>> set_requires_grad(model, requires_grad=True)

catalyst.utils.torch.get_network_output(net: torch.nn.modules.module.Module, *input_shapes)[source]¶

For each input shape returns an output tensor

Parameters

net (Model) – the model
*input_shapes – variable length argument list of shapes

catalyst.utils.torch.detach(tensor: torch.Tensor) → numpy.ndarray[source]¶: Detaches the input tensor to a numpy array

catalyst.utils.visualization.plot_confusion_matrix(cm, class_names=None, normalize=False, title='confusion matrix', fname=None, show=True, figsize=12, fontsize=32, colormap='Blues')[source]¶: Render the confusion matrix and return matplotlib's figure with it. Normalization can be applied by setting normalize=True.

catalyst.utils.visualization.render_figure_to_tensor(figure)[source]¶
