Utils¶
-
catalyst.utils.argparse.
boolean_flag
(parser: argparse.ArgumentParser, name: str, default: Optional[bool] = False, help: str = None, shorthand: str = None) → None[source]¶ Add a boolean flag to a parser in place.
- Parameters
parser (argparse.ArgumentParser) – parser to add the flag to
name (str) – argument name; --<name> will enable the flag, while --no-<name> will disable it
default (bool, optional) – default value of the flag
help (str) – help string for the flag
shorthand (str) – shorthand string for the argument
Examples
>>> parser = argparse.ArgumentParser()
>>> boolean_flag(
>>>     parser, "flag", default=False, help="some flag", shorthand="f"
>>> )
-
catalyst.utils.callbacks.
process_callbacks
(callbacks: Union[list, collections.OrderedDict]) → collections.OrderedDict[source]¶ Creates a sequence of callbacks and sorts them.
- Parameters
callbacks – either a list of callbacks or an ordered dict
- Returns
sequence of callbacks sorted by callback order
-
catalyst.utils.checkpoint.
pack_checkpoint
(model=None, criterion=None, optimizer=None, scheduler=None, **kwargs)[source]¶
-
catalyst.utils.checkpoint.
save_checkpoint
(checkpoint: Dict, logdir: Union[pathlib.Path, str], suffix: str, is_best: bool = False, is_last: bool = False, special_suffix: str = '')[source]¶
-
catalyst.utils.checkpoint.
unpack_checkpoint
(checkpoint, model=None, criterion=None, optimizer=None, scheduler=None)[source]¶
-
catalyst.utils.config.
load_ordered_yaml
(stream, Loader=<class 'yaml.loader.Loader'>, object_pairs_hook=<class 'collections.OrderedDict'>)[source]¶ Loads yaml config into OrderedDict
- Parameters
stream – opened file with yaml
Loader – base class for yaml Loader
object_pairs_hook – type of mapping
- Returns
configuration
- Return type
dict
-
catalyst.utils.config.
get_environment_vars
() → Dict[str, Any][source]¶ Creates a dictionary with environment variables
- Returns
environment variables
- Return type
dict
-
catalyst.utils.config.
dump_environment
(experiment_config: Dict, logdir: str, configs_path: List[str] = None) → None[source]¶ Saves config, environment variables and package list in JSON into logdir
- Parameters
experiment_config (dict) – experiment config
logdir (str) – path to logdir
configs_path – path(s) to config
-
catalyst.utils.config.
parse_args_uargs
(args, unknown_args)[source]¶ Function for parsing configuration files
- Parameters
args – recognized arguments
unknown_args – unrecognized arguments
- Returns
updated arguments, dict with config
- Return type
tuple
-
catalyst.utils.confusion_matrix.
calculate_confusion_matrix_from_arrays
(ground_truth: numpy.array, prediction: numpy.array, num_classes: int) → numpy.array[source]¶ Calculate confusion matrix for a given set of classes. If a ground-truth value is outside the range [0, num_classes), it is excluded.
- Parameters
ground_truth – array of ground-truth class labels
prediction – array of predicted class labels
num_classes – number of classes
- Returns
confusion matrix
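For illustration, a minimal numpy sketch of an equivalent computation (the inputs are hypothetical, and the row/column convention — rows as ground truth, columns as predictions — is an assumption):
>>> import numpy as np
>>> ground_truth = np.array([0, 1, 2, 2, 1])
>>> prediction = np.array([0, 2, 2, 2, 1])
>>> cm, _, _ = np.histogram2d(
>>>     ground_truth, prediction, bins=(3, 3), range=[[0, 3], [0, 3]]
>>> )
>>> cm.astype(int)
array([[1, 0, 0],
       [0, 1, 1],
       [0, 0, 2]])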
-
catalyst.utils.confusion_matrix.
calculate_confusion_matrix_from_tensors
(y_pred_logits: torch.Tensor, y_true: torch.Tensor) → numpy.array[source]¶
-
catalyst.utils.confusion_matrix.
calculate_tp_fp_fn
(confusion_matrix: numpy.array) → numpy.array[source]¶
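A minimal sketch of one common way to derive these counts from a confusion matrix (assuming rows are ground truth and columns are predictions; not necessarily the exact implementation):
import numpy as np

def tp_fp_fn_sketch(confusion_matrix: np.ndarray):
    # Diagonal entries are correct predictions per class.
    tp = np.diag(confusion_matrix)
    # Column sums minus the diagonal: predicted as the class, but wrong.
    fp = confusion_matrix.sum(axis=0) - tp
    # Row sums minus the diagonal: ground-truth instances that were missed.
    fn = confusion_matrix.sum(axis=1) - tp
    return tp, fp, fn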
-
catalyst.utils.dataset.
create_dataframe
(dataset: Dict[str, object], **dataframe_args) → pandas.core.frame.DataFrame[source]¶ Create pd.DataFrame from dict like {key: [values]}
- Parameters
dataset – dict like {key: [values]}
**dataframe_args –
- index : Index or array-like
Index to use for the resulting frame. Will default to np.arange(n) if no indexing information is part of the input data and no index is provided
- columns : Index or array-like
Column labels to use for the resulting frame. Will default to np.arange(n) if no column labels are provided
- dtype : dtype, default None
Data type to force, otherwise infer
- Returns
dataframe built from the given dataset
- Return type
pd.DataFrame
-
catalyst.utils.dataset.
create_dataset
(dirs: str, extension: str = None, process_fn: Callable[[str], object] = None, recursive: bool = False) → Dict[str, object][source]¶ Create dataset (dict like {key: [values]}) from a vctk-like dataset layout:
dataset/
    cat/
        *.ext
    dog/
        *.ext
- Parameters
dirs (str) – path to dirs, for example /home/user/data/**
extension (str) – data extension you are looking for
process_fn (Callable[[str], object]) – function(path_to_file) -> object, a process function applied to each found file
recursive (bool) – enables recursive globbing
- Returns
dataset
- Return type
dict
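A hedged end-to-end sketch combining create_dataset and create_dataframe (the directory layout and the column names here are hypothetical):
>>> dataset = create_dataset(dirs="/home/user/data/*", extension="*.jpg")
>>> # dataset is a dict like {"cat": [paths], "dog": [paths]}
>>> df = create_dataframe(dataset, columns=["class", "filepath"])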
-
catalyst.utils.dataset.
split_dataset_train_test
(dataset: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[Dict[str, object], Dict[str, object]][source]¶ Split dataset in train and test parts.
- Parameters
dataset – dict like dataset
**train_test_split_args –
- test_size : float, int, or None (default is None)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.
- train_size : float, int, or None (default is None)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.
- random_state : int or RandomState
Pseudo-random number generator state used for random sampling.
- stratify : array-like or None (default is None)
If not None, data is split in a stratified fashion, using this as the class labels.
- Returns
train and test dicts
-
catalyst.utils.ddp.
get_nn_from_ddp_module
(model: torch.nn.modules.module.Module) → torch.nn.modules.module.Module[source]¶ Return a real model from a torch.nn.DataParallel, torch.nn.parallel.DistributedDataParallel, or apex.parallel.DistributedDataParallel.
- Parameters
model – A model, or DataParallel wrapper.
- Returns
A model
-
catalyst.utils.ddp.
is_wrapped_with_ddp
(model: torch.nn.modules.module.Module) → bool[source]¶ Checks whether model is wrapped with DataParallel/DistributedDataParallel.
-
catalyst.utils.dict.
append_dict
(dict1, dict2)[source]¶ Appends values of dict2 (which has the same keys as dict1) to the corresponding values of dict1
-
catalyst.utils.dict.
flatten_dict
(dictionary: Dict[str, Any], parent_key: str = '', separator: str = '/') → collections.OrderedDict[source]¶ Flattens the given dictionary
- Parameters
dictionary (dict) – the given dictionary
parent_key (str, optional) – string prefix for nested keys
separator (str, optional) – delimiter between parent_key and key to use
- Returns
ordered dictionary with flattened keys
- Return type
collections.OrderedDict
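A usage sketch; the output below assumes the default “/” separator:
>>> flatten_dict({"model": {"lr": 0.01, "opt": {"name": "Adam"}}})
OrderedDict([('model/lr', 0.01), ('model/opt/name', 'Adam')])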
-
catalyst.utils.dict.
get_dictkey_auto_fn
(key: Union[str, List[str], None]) → Callable[source]¶ Function generator for sub-dict preparation from dict based on predefined keys.
- Parameters
key – keys
- Returns
function
-
catalyst.utils.dict.
get_key_dict
(dictionary: dict, key: Union[str, List[str], None]) → Dict[source]¶ Takes sub-dict from dict by dict-mapping of keys.
- Parameters
dictionary – dict
key – dict-mapping of keys
- Returns
sub-dict
-
catalyst.utils.dict.
get_key_list
(dictionary: dict, key: Union[str, List[str], None]) → Dict[source]¶ Takes sub-dict from dict by list of keys.
- Parameters
dictionary – dict
key – list of keys
- Returns
sub-dict
-
catalyst.utils.dict.
get_key_none
(dictionary: dict, key: Union[str, List[str], None]) → Dict[source]¶ Takes the whole dict.
- Parameters
dictionary – dict
key – none
- Returns
dict
-
catalyst.utils.dict.
get_key_str
(dictionary: dict, key: Union[str, List[str], None]) → Any[source]¶ Takes value from dict by key.
- Parameters
dictionary – dict
key – key
- Returns
value
-
catalyst.utils.dict.
merge_dicts
(*dicts: dict) → dict[source]¶ Recursive dict merge. Instead of updating only top-level keys, merge_dicts recurses down into dicts nested to an arbitrary depth, updating keys.
- Parameters
*dicts – several dictionaries to merge
- Returns
deep-merged dictionary
- Return type
dict
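A usage sketch illustrating the deep merge (the values here are hypothetical):
>>> merge_dicts({"model": {"lr": 0.01}}, {"model": {"bs": 64}})
{'model': {'lr': 0.01, 'bs': 64}}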
-
catalyst.utils.hash.
get_hash
(obj: Any) → str[source]¶ Creates a unique hash from an object in the following way: represent obj as a string recursively, hash this string with the sha256 hash function, and encode the hash with url-safe base64 encoding.
- Parameters
obj – object to hash
- Returns
base64-encoded string
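A minimal sketch of the documented steps, assuming repr is the recursive string representation (the real implementation may differ):
import base64
import hashlib

def get_hash_sketch(obj) -> str:
    obj_str = repr(obj)  # assumption: recursive string representation
    digest = hashlib.sha256(obj_str.encode("utf-8")).digest()  # sha256 hash
    return base64.urlsafe_b64encode(digest).decode("ascii")  # url-safe base64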
-
catalyst.utils.image.
has_image_extension
(uri) → bool[source]¶ Checks whether the file has an image extension
- Parameters
uri (Union[str, pathlib.Path]) – The resource to load the file from
- Returns
True if file has image extension, False otherwise
- Return type
bool
-
catalyst.utils.image.
imread
(uri, grayscale: bool = False, expand_dims: bool = True, rootpath: Union[str, pathlib.Path] = None, **kwargs)[source]¶ - Parameters
uri ({str, pathlib.Path, bytes, file}) – the resource to load the image from, e.g. a filename, pathlib.Path, http address or file object; see the docs for more info
grayscale (bool) – if True, the image is read as grayscale
expand_dims (bool) – if True, append channel axis to grayscale images
rootpath (Union[str, pathlib.Path]) – path to an image (allows to use relative path)
- Returns
np.ndarray with the loaded image
-
catalyst.utils.image.
mask_to_overlay_image
(image: numpy.ndarray, masks: List[numpy.ndarray], threshold: float = 0, mask_strength: float = 0.5) → numpy.ndarray[source]¶ Draws every mask with some color over the image
- Parameters
image (np.ndarray) – RGB image used as underlay for masks
masks (List[np.ndarray]) – list of masks
threshold (float) – threshold for masks binarization
mask_strength (float) – opacity of colorized masks
- Returns
HxWx3 image with overlay
- Return type
np.ndarray
-
catalyst.utils.image.
mimread
(uri, clip_range: Tuple[int, int] = None, expand_dims: bool = True, rootpath: Union[str, pathlib.Path] = None, **kwargs)[source]¶ - Parameters
uri ({str, pathlib.Path, bytes, file}) – the resource to load the mask from, e.g. a filename, pathlib.Path, http address or file object; see the docs for more info
clip_range (Tuple[int, int]) – lower and upper interval edges, image values outside the interval are clipped to the interval edges
expand_dims (bool) – if True, append channel axis to grayscale images
rootpath (Union[str, pathlib.Path]) – path to an image (allows to use relative path)
- Returns
Image
- Return type
np.ndarray
-
catalyst.utils.image.
tensor_to_ndimage
(images: torch.Tensor, denormalize: bool = True, mean: Tuple[float, float, float] = (0.485, 0.456, 0.406), std: Tuple[float, float, float] = (0.229, 0.224, 0.225), move_channels_dim: bool = True, dtype=<class 'numpy.float32'>) → numpy.ndarray[source]¶ Convert float image(s) with standard normalization to np.ndarray with values in [0..1] when dtype is np.float32 and in [0..255] when dtype is np.uint8.
- Parameters
images (torch.Tensor) – [B]xCxHxW float tensor
denormalize (bool) – if True, multiply image(s) by std and add mean
mean (Tuple[float, float, float]) – per channel mean to add
std (Tuple[float, float, float]) – per channel std to multiply
move_channels_dim (bool) – if True, convert tensor to [B]xHxWxC format
dtype – result ndarray dtype. Only float32 and uint8 are supported.
- Returns
[B]xHxWxC np.ndarray of dtype
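A usage sketch (shapes and values are hypothetical):
>>> images = torch.rand(2, 3, 32, 32)  # [B]xCxHxW float tensor
>>> arr = tensor_to_ndimage(images, dtype=np.uint8)
>>> arr.shape  # channels moved to the last dimension
(2, 32, 32, 3)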
-
catalyst.utils.initialization.
bias_init_with_prob
(prior_prob)[source]¶ Initialize conv/fc bias value according to the given probability
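A sketch of the standard formula for this kind of initialization (solving sigmoid(bias) = prior_prob for the bias; whether it matches the exact implementation is an assumption):
import math

def bias_init_with_prob_sketch(prior_prob: float) -> float:
    # bias = -log((1 - p) / p) makes sigmoid(bias) equal to prior_prob
    return -math.log((1 - prior_prob) / prior_prob)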
-
catalyst.utils.initialization.
constant_init
(module, val, bias=0)[source]¶ Initialize the module with constant value
-
catalyst.utils.initialization.
create_optimal_inner_init
(nonlinearity: torch.nn.modules.module.Module, **kwargs) → Callable[[torch.nn.modules.module.Module], None][source]¶ Create initializer for inner layers based on their activation function (nonlinearity).
- Parameters
nonlinearity – non-linear activation
-
catalyst.utils.initialization.
kaiming_init
(module, mode='fan_out', nonlinearity='relu', bias=0, distribution='normal')[source]¶ Initialize the module with He (Kaiming) initialization
-
catalyst.utils.initialization.
normal_init
(module, mean=0, std=1, bias=0)[source]¶ Initialize the module with normal distribution
-
catalyst.utils.initialization.
outer_init
(layer: torch.nn.modules.module.Module) → None[source]¶ Initialization for output layers of policy and value networks typically used in deep reinforcement learning literature.
-
catalyst.utils.initialization.
uniform_init
(module, a=0, b=1, bias=0)[source]¶ Initialize the module with uniform distribution
-
catalyst.utils.initialization.
xavier_init
(module, gain=1, bias=0, distribution='normal')[source]¶ Initialize the module with Xavier initialization
-
catalyst.utils.misc.
args_are_not_none
(*args: Optional[Any]) → bool[source]¶ Check that all arguments are not None.
- Parameters
*args (Any) – values
- Returns
True if all values are not None, False otherwise
- Return type
bool
-
catalyst.utils.misc.
copy_directory
(input_dir: pathlib.Path, output_dir: pathlib.Path) → None[source]¶ Recursively copies the input directory
- Parameters
input_dir (Path) – input directory
output_dir (Path) – output directory
-
catalyst.utils.misc.
format_metric
(name: str, value: float) → str[source]¶ Format metric. The metric is returned in scientific format if 4 decimal places are not enough (i.e. the metric value is lower than 1e-4)
- Parameters
name (str) – metric name
value (float) – value of metric
-
catalyst.utils.misc.
get_utcnow_time
(format: str = None) → str[source]¶ Return a string with the current UTC time in the chosen format
- Parameters
format (str) – format string; if None, “%y%m%d.%H%M%S” will be used
- Returns
formatted utc time string
- Return type
str
-
catalyst.utils.misc.
is_exception
(ex: Any) → bool[source]¶ Check if the argument is of Exception type
-
catalyst.utils.misc.
make_tuple
(tuple_like)[source]¶ Creates a tuple if the given tuple_like value isn’t a list or tuple.
- Returns
tuple or list
-
catalyst.utils.misc.
maybe_recursive_call
(object_or_dict, method: str, recursive_args=None, recursive_kwargs=None, **kwargs)[source]¶ Calls the method recursively for the object_or_dict.
- Parameters
object_or_dict (Any) – some object or a dictionary of objects
method (str) – method name to call
recursive_args – list of arguments to pass to the method
recursive_kwargs – list of keyword arguments to pass to the method
**kwargs – Arbitrary keyword arguments
-
catalyst.utils.misc.
pairwise
(iterable: Iterable[Any]) → Iterable[Any][source]¶ Iterates over a sequence by pairs
- Parameters
iterable – Any iterable sequence
- Returns
pairwise iterator
Examples
>>> for i in pairwise([1, 2, 5, -3]):
>>>     print(i)
(1, 2)
(2, 5)
(5, -3)
-
catalyst.utils.notebook.
save_notebook
(filepath: str, wait_period: float = 0.1, max_wait_time=1.0)[source]¶
-
catalyst.utils.numpy.
geometric_cumsum
(alpha, x)[source]¶ Calculate future accumulated sums for each element in a list with an exponential factor.
Given input data \(x_1, \dots, x_n\) and exponential factor \(\alpha \in [0, 1]\), it returns an array \(y\) of the same length, where each element is calculated as
\[y_i = x_i + \alpha x_{i+1} + \alpha^2 x_{i+2} + \dots + \alpha^{n-i-1} x_{n-1} + \alpha^{n-i} x_n\]
Note
To gain the optimal runtime speed, we use
scipy.signal.lfilter
Example
>>> geometric_cumsum(0.1, [[1, 1], [2, 2], [3, 3], [4, 4]])
array([[1.234, 1.234],
       [2.34 , 2.34 ],
       [3.4  , 3.4  ],
       [4.   , 4.   ]])
- Parameters
alpha (float) – exponential factor between zero and one.
x (np.ndarray) – input data, [trajectory_len, num_atoms]
- Returns
calculated data
- Return type
out (np.ndarray)
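A sketch of the lfilter trick mentioned in the note (a standard way to compute discounted suffix sums; the exact implementation is not asserted):
import numpy as np
from scipy.signal import lfilter

def geometric_cumsum_sketch(alpha: float, x) -> np.ndarray:
    # Filtering the time-reversed signal with y[t] = x[t] + alpha * y[t-1]
    # and reversing back yields the discounted sums over each suffix.
    x = np.asarray(x, dtype=np.float64)
    return lfilter([1.0], [1.0, -alpha], x[::-1], axis=0)[::-1]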
-
catalyst.utils.numpy.
get_one_hot
(label: int, num_classes: int, smoothing: float = None) → numpy.ndarray[source]¶ Applies OneHot vectorization to a given scalar, optionally with label smoothing from https://arxiv.org/abs/1812.01187
- Parameters
label (int) – scalar value to be vectorized
num_classes (int) – total number of classes
smoothing (float, optional) – if specified, applies label smoothing from the Bag of Tricks for Image Classification with Convolutional Neural Networks paper
- Returns
a one-hot vector with shape
(num_classes,)
- Return type
np.ndarray
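For illustration, the unsmoothed case (with smoothing set, probability mass is redistributed across classes as described in the paper):
>>> get_one_hot(2, num_classes=4)
array([0., 0., 1., 0.])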
-
catalyst.utils.pandas.
balance_classes
(dataframe: pandas.core.frame.DataFrame, class_column: str = 'label', random_state: int = 42, how: str = 'downsampling') → pandas.core.frame.DataFrame[source]¶ Balance classes in dataframe by class_column
See also
catalyst.data.sampler.BalanceClassSampler
- Parameters
dataframe – a dataset
class_column – which column to use for split
random_state – seed for random shuffle
how – sampling strategy, must be one of [“downsampling”, “upsampling”]
- Returns
new dataframe with balanced class_column
- Return type
pd.DataFrame
-
catalyst.utils.pandas.
dataframe_to_list
(dataframe: pandas.core.frame.DataFrame) → List[dict][source]¶ Converts dataframe to a list of rows (without indexes)
- Parameters
dataframe (DataFrame) – input dataframe
- Returns
list of rows
- Return type
(List[dict])
-
catalyst.utils.pandas.
folds_to_list
(folds: Union[list, str, pandas.core.series.Series]) → List[int][source]¶ This function formats a string or a list of numbers into a list of unique ints
- Parameters
folds (Union[list, str, pd.Series]) – Either list of numbers or one string with numbers separated by commas or pandas series
- Returns
list of unique ints
- Return type
List[int]
Examples
>>> folds_to_list("1,2,1,3,4,2,4,6") [1, 2, 3, 4, 6] >>> folds_to_list([1, 2, 3.0, 5]) [1, 2, 3, 5]
- Raises
ValueError – if a value in the string or array cannot be cast to int
-
catalyst.utils.pandas.
get_dataset_labeling
(dataframe: pandas.core.frame.DataFrame, tag_column: str) → Dict[str, int][source]¶ Prepares a mapping using unique values from tag_column:
{"class_name_0": 0, "class_name_1": 1, ..., "class_name_N": N}
- Parameters
dataframe – a dataset
tag_column – which column to use
- Returns
mapping from tag to labels
- Return type
Dict[str, int]
-
catalyst.utils.pandas.
map_dataframe
(dataframe: pandas.core.frame.DataFrame, tag_column: str, class_column: str, tag2class: Dict[str, int], verbose: bool = False) → pandas.core.frame.DataFrame[source]¶ This function maps tags from tag_column to ints in class_column using the tag2class dictionary.
- Parameters
dataframe (pd.DataFrame) – input dataframe
tag_column (str) – column with tags
class_column (str) – column to write the mapped class labels to
tag2class (Dict[str, int]) – mapping from tags to class labels
verbose – if True, uses tqdm
- Returns
updated dataframe with class_column
- Return type
pd.DataFrame
-
catalyst.utils.pandas.
merge_multiple_fold_csv
(fold_name: str, paths: Optional[str]) → pandas.core.frame.DataFrame[source]¶ Reads csv files into one DataFrame with a fold column.
- Parameters
fold_name (str) – current fold name
paths (str) – paths to csv files separated by commas
- Returns
merged dataframes with fold == fold_name
- Return type
pd.DataFrame
-
catalyst.utils.pandas.
read_csv_data
(in_csv: str = None, train_folds: Optional[List[int]] = None, valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, seed: int = 42, n_folds: int = 5, in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, List[dict], List[dict], List[dict]][source]¶ From the given path in_csv, reads a dataframe and splits it into train/valid/infer folds, or reads independent folds from the several paths in_csv_train, in_csv_valid, in_csv_infer.
Note
- This function can be used with different combinations of params.
- First block is used to get dataset from one csv:
in_csv, train_folds, valid_folds, infer_folds, seed, n_folds
- Second includes paths to different csv for train/valid and infer parts:
in_csv_train, in_csv_valid, in_csv_infer
- The other params (tag2class, tag_column, class_column) are optional for any previous block
- Parameters
in_csv (str) – paths to whole dataset
train_folds (List[int]) – train folds
valid_folds (List[int], optional) – valid folds. If None, takes all folds not included in train_folds
infer_folds (List[int], optional) – infer folds. If None, takes all folds not included in train_folds and valid_folds
seed (int) – seed for split
n_folds (int) – number of folds
in_csv_train (str) – paths to train csv separated by commas
in_csv_valid (str) – paths to valid csv separated by commas
in_csv_infer (str) – paths to infer csv separated by commas
tag2class (Dict[str, int]) – mapping from label names into ints
tag_column (str) – column with label names
class_column (str) – column to use for split
- Returns
tuple with 4 elements (whole dataframe, list with train data, list with valid data and list with infer data)
- Return type
(Tuple[pd.DataFrame, List[dict], List[dict], List[dict]])
-
catalyst.utils.pandas.
read_multiple_dataframes
(in_csv_train: str = None, in_csv_valid: str = None, in_csv_infer: str = None, tag2class: Optional[Dict[str, int]] = None, class_column: str = None, tag_column: str = None) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]¶ This function reads train/valid/infer dataframes from the given paths.
- Parameters
in_csv_train (str) – paths to train csv separated by commas
in_csv_valid (str) – paths to valid csv separated by commas
in_csv_infer (str) – paths to infer csv separated by commas
tag2class (Dict[str, int], optional) – mapping from label names into int
tag_column (str, optional) – column with label names
class_column (str, optional) – column to use for split
- Returns
- tuple with 4 dataframes
whole dataframe, train part, valid part and infer part
- Return type
(tuple)
-
catalyst.utils.pandas.
separate_tags
(dataframe: pandas.core.frame.DataFrame, tag_column: str, tag_delim: str)[source]¶ Separates values in the tag_column column.
- Parameters
dataframe – a dataset
tag_column – column name to separate values
tag_delim – delimiter to separate values
- Returns
new dataframe
- Return type
pd.DataFrame
-
catalyst.utils.pandas.
split_dataframe
(dataframe: pandas.core.frame.DataFrame, train_folds: List[int], valid_folds: Optional[List[int]] = None, infer_folds: Optional[List[int]] = None, tag2class: Optional[Dict[str, int]] = None, tag_column: str = None, class_column: str = None, seed: int = 42, n_folds: int = 5) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]¶ Split a Pandas DataFrame into folds.
- Parameters
dataframe (pd.DataFrame) – input dataframe
train_folds (List[int]) – train folds
valid_folds (List[int], optional) – valid folds. If None, takes all folds not included in train_folds
infer_folds (List[int], optional) – infer folds. If None, takes all folds not included in train_folds and valid_folds
tag2class (Dict[str, int], optional) – mapping from label names into int
tag_column (str, optional) – column with label names
class_column (str, optional) – column to use for split
seed (int) – seed for split
n_folds (int) – number of folds
- Returns
- tuple with 4 dataframes
whole dataframe, train part, valid part and infer part
- Return type
(tuple)
-
catalyst.utils.pandas.
split_dataframe_on_column_folds
(dataframe: pandas.core.frame.DataFrame, column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]¶ Splits DataFrame into N folds.
- Parameters
dataframe – a dataset
column – which column to use
random_state – seed for random shuffle
n_folds – number of result folds
- Returns
new dataframe with fold column
- Return type
pd.DataFrame
-
catalyst.utils.pandas.
split_dataframe_on_folds
(dataframe: pandas.core.frame.DataFrame, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]¶ Splits DataFrame into N folds.
- Parameters
dataframe – a dataset
random_state – seed for random shuffle
n_folds – number of result folds
- Returns
new dataframe with fold column
- Return type
pd.DataFrame
-
catalyst.utils.pandas.
split_dataframe_on_stratified_folds
(dataframe: pandas.core.frame.DataFrame, class_column: str, random_state: int = 42, n_folds: int = 5) → pandas.core.frame.DataFrame[source]¶ Splits DataFrame into N stratified folds.
Also see
catalyst.data.sampler.BalanceClassSampler
- Parameters
dataframe – a dataset
class_column – which column to use for split
random_state – seed for random shuffle
n_folds – number of result folds
- Returns
new dataframe with fold column
- Return type
pd.DataFrame
-
catalyst.utils.pandas.
split_dataframe_train_test
(dataframe: pandas.core.frame.DataFrame, **train_test_split_args) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame][source]¶ Split dataframe into train and test parts.
- Parameters
dataframe – pd.DataFrame to split
**train_test_split_args –
- test_size : float, int, or None (default is None)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is automatically set to the complement of the train size. If train size is also None, test size is set to 0.25.
- train_size : float, int, or None (default is None)
If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.
- random_state : int or RandomState
Pseudo-random number generator state used for random sampling.
- stratify : array-like or None (default is None)
If not None, data is split in a stratified fashion, using this as the class labels.
- Returns
train and test DataFrames
P.S. It exists because the sklearn split is overcomplicated.
-
catalyst.utils.parallel.
get_pool
(workers: int) → Union[multiprocessing.pool.Pool, catalyst.utils.parallel.DumbPool][source]¶
-
catalyst.utils.parallel.
parallel_imap
(func, args, pool: Union[multiprocessing.pool.Pool, catalyst.utils.parallel.DumbPool]) → List[T][source]¶
-
catalyst.utils.parallel.
tqdm_parallel_imap
(func, args, pool: Union[multiprocessing.pool.Pool, catalyst.utils.parallel.DumbPool], total: int = None, pbar=<class 'tqdm.std.tqdm'>) → List[T][source]¶
-
catalyst.utils.plotly.
plot_tensorboard_log
(logdir: Union[str, pathlib.Path], step: Optional[str] = 'batch', metrics: Optional[List[str]] = None, height: Optional[int] = None, width: Optional[int] = None) → None[source]¶
-
catalyst.utils.seed.
set_global_seed
(seed: int) → None[source]¶ Sets random seed into PyTorch, TensorFlow, Numpy and Random.
- Parameters
seed – random seed
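A minimal sketch of what such a utility typically does (the real function also seeds TensorFlow when it is available):
import random
import numpy as np
import torch

def set_global_seed_sketch(seed: int) -> None:
    random.seed(seed)            # python stdlib RNG
    np.random.seed(seed)         # numpy RNG
    torch.manual_seed(seed)      # torch CPU RNG
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)  # all CUDA devices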
-
catalyst.utils.serialization.
deserialize
(data)¶ Deserialize bytes into an object using pickle
- Parameters
data – a bytes object containing data serialized with pickle
- Returns
Returns a value deserialized from the bytes-like object.
-
catalyst.utils.serialization.
pickle_deserialize
(data)[source]¶ Deserialize bytes into an object using pickle
- Parameters
data – a bytes object containing data serialized with pickle
- Returns
Returns a value deserialized from the bytes-like object.
-
catalyst.utils.serialization.
pickle_serialize
(data)[source]¶ Serialize the data into bytes using pickle
- Parameters
data – a value
- Returns
Returns a bytes object serialized with pickle data.
-
catalyst.utils.serialization.
pyarrow_deserialize
(data)[source]¶ Deserialize bytes into an object using pyarrow
- Parameters
data – a bytes object containing data serialized with pyarrow
- Returns
Returns a value deserialized from the bytes-like object.
-
catalyst.utils.serialization.
pyarrow_serialize
(data)[source]¶ Serialize the data into bytes using pyarrow
- Parameters
data – a value
- Returns
Returns a bytes object serialized with pyarrow data.
-
catalyst.utils.serialization.
serialize
(data)¶ Serialize the data into bytes using pickle
- Parameters
data – a value
- Returns
Returns a bytes object serialized with pickle data.
-
catalyst.utils.torch.
ce_with_logits
(logits, target)[source]¶ Returns cross entropy for the given logits
-
catalyst.utils.torch.
log1p_exp
(x)[source]¶ Computationally stable function for computing log(1+exp(x)).
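A sketch of the standard numerically stable identity (whether the implementation uses exactly this form is an assumption):
import torch

def log1p_exp_sketch(x: torch.Tensor) -> torch.Tensor:
    # log(1 + exp(x)) = max(x, 0) + log1p(exp(-|x|)):
    # avoids overflow for large positive x and precision loss for negative x.
    return torch.clamp(x, min=0) + torch.log1p(torch.exp(-torch.abs(x)))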
-
catalyst.utils.torch.
normal_sample
(mu, sigma)[source]¶ Sample from multivariate Gaussian distribution z ~ N(z|mu,sigma) while supporting backpropagation through its mean and variance.
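A sketch of the reparameterization trick this describes (assuming a diagonal covariance, with sigma as the per-dimension standard deviation):
import torch

def normal_sample_sketch(mu: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor:
    # z = mu + sigma * eps with eps ~ N(0, I): gradients flow through
    # mu and sigma, while the noise itself stays non-differentiable.
    return mu + sigma * torch.randn_like(sigma)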
-
catalyst.utils.torch.
normal_logprob
(mu, sigma, z)[source]¶ Probability density function of multivariate Gaussian distribution N(z|mu,sigma).
-
catalyst.utils.torch.
soft_update
(target, source, tau)[source]¶ Updates the target data with smoothing by tau
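A sketch of the usual Polyak-averaging update this suggests (treating target and source as modules is an assumption of this sketch):
import torch

def soft_update_sketch(target: torch.nn.Module, source: torch.nn.Module, tau: float) -> None:
    # target <- tau * source + (1 - tau) * target, parameter by parameter
    with torch.no_grad():
        for t_param, s_param in zip(target.parameters(), source.parameters()):
            t_param.data.copy_(tau * s_param.data + (1.0 - tau) * t_param.data)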
-
catalyst.utils.torch.
get_optimizable_params
(model_or_params)[source]¶ Returns all the parameters that require gradients
-
catalyst.utils.torch.
get_optimizer_momentum
(optimizer: torch.optim.optimizer.Optimizer) → float[source]¶ Get momentum of current optimizer.
- Parameters
optimizer – PyTorch optimizer
- Returns
momentum at first param group
- Return type
float
-
catalyst.utils.torch.
set_optimizer_momentum
(optimizer: torch.optim.optimizer.Optimizer, value: float, index: int = 0)[source]¶ Set momentum of the index-th param group of the optimizer to value.
- Parameters
optimizer – PyTorch optimizer
value (float) – new value of momentum
index (int, optional) – integer index of optimizer’s param groups, default is 0
-
catalyst.utils.torch.
assert_fp16_available
() → None[source]¶ Asserts that Apex FP16 support is installed and available
-
catalyst.utils.torch.
get_device
() → torch.device[source]¶ Simply returns the best available device (GPU or CPU)
-
catalyst.utils.torch.
get_available_gpus
()[source]¶ Array of available GPU ids.
- Returns
available GPU ids
- Return type
iterable
Examples
>>> os.environ["CUDA_VISIBLE_DEVICES"] = "0,2" >>> get_available_gpus() >>> [0, 2]
>>> os.environ["CUDA_VISIBLE_DEVICES"] = "0,-1,1" >>> get_available_gpus() >>> [0]
>>> os.environ["CUDA_VISIBLE_DEVICES"] = "" >>> get_available_gpus() >>> []
>>> os.environ["CUDA_VISIBLE_DEVICES"] = "-1" >>> get_available_gpus() >>> []
-
catalyst.utils.torch.
get_activation_fn
(activation: str = None)[source]¶ Returns the activation function from torch.nn by its name
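A minimal sketch of name-based lookup (the handling of None here is an assumption):
import torch

def get_activation_fn_sketch(activation: str = None):
    if activation is None or activation.lower() == "none":
        return lambda x: x  # assumption: identity when no activation is given
    # resolve e.g. "ReLU" -> torch.nn.ReLU and instantiate it
    return getattr(torch.nn, activation)()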
-
catalyst.utils.torch.
any2device
(value, device: Union[str, torch.device])[source]¶ Move tensor, list of tensors, list of list of tensors, dict of tensors, tuple of tensors to target device.
- Parameters
value – Object to be moved
device (Device) – target device ids
- Returns
Same structure as value, but all tensors and np.arrays moved to device
-
catalyst.utils.torch.
prepare_cudnn
(deterministic: bool = None, benchmark: bool = None) → None[source]¶ Prepares CuDNN benchmark and sets CuDNN to deterministic or non-deterministic mode
- Parameters
deterministic (bool) – deterministic mode if running in CuDNN backend.
benchmark (bool) – if True, use CuDNN heuristics to figure out which algorithm will be most performant for your model architecture and input. Setting it to False may slow down your training.
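Presumably this toggles the standard PyTorch backend flags; a minimal sketch of a fully deterministic setup:
import torch

torch.backends.cudnn.deterministic = True  # reproducible kernel selection
torch.backends.cudnn.benchmark = False     # disable the autotuner heuristics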
-
catalyst.utils.torch.
process_model_params
(model: torch.nn.modules.module.Module, layerwise_params: Dict[str, dict] = None, no_bias_weight_decay: bool = True, lr_scaling: float = 1.0) → List[Union[torch.nn.parameter.Parameter, dict]][source]¶ Gathers model parameters for torch.optim.Optimizer
- Parameters
model (torch.nn.Module) – Model to process
layerwise_params (Dict) – Order-sensitive dict where each key is regex pattern and values are layer-wise options for layers matching with a pattern
no_bias_weight_decay (bool) – if True, removes weight_decay for all bias parameters in the model
lr_scaling (float) – layer-wise learning rate scaling; if 1.0, learning rates will not be scaled
- Returns
parameters for an optimizer
- Return type
iterable
Examples
>>> model = catalyst.contrib.models.segmentation.ResnetUnet()
>>> layerwise_params = collections.OrderedDict([
>>>     ("conv1.*", dict(lr=0.001, weight_decay=0.0003)),
>>>     ("conv.*", dict(lr=0.002))
>>> ])
>>> params = process_model_params(model, layerwise_params)
>>> optimizer = torch.optim.Adam(params, lr=0.0003)
-
catalyst.utils.torch.
set_requires_grad
(model: torch.nn.modules.module.Module, requires_grad: bool)[source]¶ Sets the requires_grad value for all model parameters.
- Parameters
model (torch.nn.Module) – Model
requires_grad (bool) – value
Examples
>>> model = SimpleModel()
>>> set_requires_grad(model, requires_grad=True)
-
catalyst.utils.torch.
get_network_output
(net: torch.nn.modules.module.Module, *input_shapes)[source]¶ For each input shape returns an output tensor
- Parameters
net (Model) – the model
*input_shapes – variable length argument list of shapes
-
catalyst.utils.torch.
detach
(tensor: torch.Tensor) → numpy.ndarray[source]¶ Detaches the input tensor and converts it to a numpy array
-
catalyst.utils.visualization.
plot_confusion_matrix
(cm, class_names=None, normalize=False, title='confusion matrix', fname=None, show=True, figsize=12, fontsize=32, colormap='Blues')[source]¶ Render the confusion matrix and return matplotlib’s figure with it. Normalization can be applied by setting normalize=True.
Tools¶
-
class
catalyst.utils.tools.metric_manager.
MetricManager
(valid_loader: str = 'valid', main_metric: str = 'loss', minimize: bool = True, batch_consistant_metrics: bool = True)[source]¶ Bases:
object
-
property
batch_values
¶
-
property
main_metric_value
¶
-
class
catalyst.utils.tools.dynamic_array.
DynamicArray
(array_or_shape=(None, ), dtype=None, capacity=64, allow_views_on_resize=False)[source]¶ Bases:
object
Dynamically growable numpy array. Credits to https://github.com/maciejkula/dynarray
- Parameters
array_or_shape (numpy array or tuple) – If an array, a growable array with the same shape, dtype, and a copy of the data will be created. The array will grow along the first dimension. If a tuple, an empty array of the specified shape will be created. The first element needs to be None to denote that the array will grow along the first dimension.
dtype (optional, array dtype) – The dtype the array should have.
capacity (int, optional) – The initial capacity of the array.
allow_views_on_resize (boolean, optional) – If False, an exception will be thrown if the array is resized while there are live references to the array’s contents. When the array is resized, these will point at old data. Set to True if you want to silence the exception.
Examples
Create a multidimensional array and append rows:
>>> from dynarray import DynamicArray
>>> # The leading dimension is None to denote that this is
>>> # the dynamic dimension
>>> array = DynamicArray((None, 20, 10))
>>> array.append(np.random.random((20, 10)))
>>> array.extend(np.random.random((100, 20, 10)))
Slice and perform arithmetic like with normal numpy arrays:
>>> array[:2]
-
MAGIC_METHODS
= ('__radd__', '__add__', '__sub__', '__rsub__', '__mul__', '__rmul__', '__div__', '__rdiv__', '__pow__', '__rpow__', '__eq__', '__len__')¶
-
__init__
(array_or_shape=(None, ), dtype=None, capacity=64, allow_views_on_resize=False)[source]¶ Init
-
append
(value)[source]¶ Append a row to the array.
The row’s shape has to match the array’s trailing dimensions.
-
property
capacity
¶ Capacity of the array
-
property
dtype
¶ Dtype of the array
-
extend
(values)[source]¶ Extend the array with a set of rows.
The rows’ dimensions must match the trailing dimensions of the array.
-
property
shape
¶ Shape of the array
-
class
catalyst.utils.tools.frozen_class.
FrozenClass
[source]¶ Bases:
object
Class which prohibits __setattr__ on existing attributes
Examples
>>> class State(FrozenClass):
-
class
catalyst.utils.tools.registry.
Registry
(default_name_key: str, default_meta_factory: Callable[[Union[Type, Callable[[...], Any]], Tuple, Mapping], Any] = <function _default_meta_factory>)[source]¶ Bases:
collections.abc.MutableMapping
Universal class that allows adding and accessing various factories by name
-
__init__
(default_name_key: str, default_meta_factory: Callable[[Union[Type, Callable[[...], Any]], Tuple, Mapping], Any] = <function _default_meta_factory>)[source]¶ - Parameters
default_name_key (str) – Default key containing factory name when creating from config
default_meta_factory (MetaFactory) – default object that calls factory. Optional. Default just calls factory.
-
add
(factory: Union[Type, Callable[[...], Any]] = None, *factories: Union[Type, Callable[[...], Any]], name: str = None, **named_factories: Union[Type, Callable[[...], Any]]) → Union[Type, Callable[[...], Any]][source]¶ Adds a factory to the registry with its __name__ attribute or the provided name. The signature is flexible.
- Parameters
factory – Factory instance
factories – More instances
name – provided name for the first instance. Use only when passing a single instance.
named_factories – Factory and their names as kwargs
- Returns
First factory passed
- Return type
(Factory)
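A usage sketch under the documented signature (the decorator form and the names here are illustrative, not asserted):
>>> registry = Registry(default_name_key="model")
>>> @registry.add
>>> def some_factory():
>>>     return "result"
>>> registry.get("some_factory")()
'result'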
-
add_from_module
(module, prefix: Union[str, List[str]] = None) → None[source]¶ Adds all factories present in the module. If the __all__ attribute is present, takes only what is mentioned in it.
- Parameters
module – module to scan
prefix (Union[str, List[str]]) – prefix string for all the module’s factories. If prefix is a list, all values will be treated as aliases.
-
get
(name: str) → Union[Type, Callable[[...], Any], None][source]¶ Retrieves a factory without creating any objects with it, or raises an error
- Parameters
name – factory name
- Returns
factory by name
- Return type
Factory
-
get_from_params
(*, meta_factory=None, **kwargs) → Union[Any, Tuple[Any, Mapping[str, Any]]][source]¶ Creates an instance based on a configuration dict with instantiation_fn. If config[name_key] is None, None is returned.
- Parameters
meta_factory – Function that calls factory the right way. If not provided, default is used.
**kwargs – additional kwargs for factory
- Returns
result of calling instantiate_fn(factory, **config)
-
get_if_str
(obj: Union[str, Type, Callable[[...], Any]])[source]¶ Returns object from the registry if
obj
type is string
-
get_instance
(name: str, *args, meta_factory=None, **kwargs)[source]¶ Creates instance by calling specified factory with instantiate_fn
- Parameters
name – factory name
meta_factory – Function that calls factory the right way. If not provided, default is used
args – args to pass to the factory
**kwargs – kwargs to pass to the factory
- Returns
created instance
-
-
exception
catalyst.utils.tools.registry.
RegistryException
(message)[source]¶ Bases:
Exception
Exception class for all registry errors
-
class
catalyst.utils.tools.seeder.
Seeder
(init_seed: int = 0, max_seed: int = None)[source]¶ Bases:
object
A random seed generator.
Given an initial seed, the seeder can be called continuously to sample a single or a batch of random seeds.
Note
The seeder creates an independent RandomState to generate random numbers. It does not affect the RandomState in
np.random
.- Example::
>>> seeder = Seeder(init_seed=0)
>>> seeder(size=5)
[209652396, 398764591, 924231285, 1478610112, 441365315]
-
exception
catalyst.utils.tools.tensorboard.
EventReadingError
[source]¶ Bases:
Exception
An exception that corresponds to an event file reading error
-
class
catalyst.utils.tools.tensorboard.
EventsFileReader
(events_file: BinaryIO)[source]¶ Bases:
collections.abc.Iterable
An iterator over a Tensorboard events file
-
class
catalyst.utils.tools.tensorboard.
SummaryItem
(tag, step, wall_time, value, type)¶ Bases:
tuple
-
property
step
¶ Alias for field number 1
-
property
tag
¶ Alias for field number 0
-
property
type
¶ Alias for field number 4
-
property
value
¶ Alias for field number 3
-
property
wall_time
¶ Alias for field number 2
-
property
-
class
catalyst.utils.tools.tensorboard.
SummaryReader
(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))[source]¶ Bases:
collections.abc.Iterable
Iterates over events in all the files in the current logdir. Only scalars and images are supported at the moment.
-
__init__
(logdir: Union[str, pathlib.Path], tag_filter: Optional[collections.abc.Iterable] = None, types: collections.abc.Iterable = ('scalar',))[source]¶ Initialize a new summary reader
- Parameters
logdir – A directory with Tensorboard summary data
tag_filter – a list of tags to leave (None for all)
types – a list of types to get; only "scalar" and "image" types are allowed at the moment
-
-
catalyst.utils.tools.typing.
Model
¶ alias of
torch.nn.modules.module.Module
-
catalyst.utils.tools.typing.
Criterion
¶ alias of
torch.nn.modules.module.Module
-
class
catalyst.utils.tools.typing.
Optimizer
(params, defaults)[source]¶ Bases:
object
Base class for all optimizers.
Warning
Parameters need to be specified as collections that have a deterministic ordering that is consistent between runs. Examples of objects that don’t satisfy those properties are sets and iterators over values of dictionaries.
- Parameters
params (iterable) – an iterable of torch.Tensor s or dict s. Specifies what Tensors should be optimized.
defaults (dict) – a dict containing default values of optimization options (used when a parameter group doesn’t specify them).
-
add_param_group
(param_group)[source]¶ Add a param group to the Optimizer’s param_groups.
This can be useful when fine-tuning a pre-trained network, as frozen layers can be made trainable and added to the Optimizer as training progresses.
- Parameters
param_group (dict) – specifies what Tensors should be optimized along with group-specific optimization options
-
load_state_dict
(state_dict)[source]¶ Loads the optimizer state.
- Parameters
state_dict (dict) – optimizer state. Should be an object returned from a call to state_dict().
-
state_dict
()[source]¶ Returns the state of the optimizer as a dict.
It contains two entries:
- state – a dict holding current optimization state. Its content differs between optimizer classes.
- param_groups – a dict containing all parameter groups
-
catalyst.utils.tools.typing.
Scheduler
¶ alias of
torch.optim.lr_scheduler._LRScheduler
-
class
catalyst.utils.tools.typing.
Dataset
[source]¶ Bases:
object
An abstract class representing a
Dataset
.All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite
__getitem__()
, supporting fetching a data sample for a given key. Subclasses could also optionally overwrite__len__()
, which is expected to return the size of the dataset by manySampler
implementations and the default options ofDataLoader
.Note
DataLoader
by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
Criterion¶
-
catalyst.utils.criterion.accuracy.
accuracy
(outputs, targets, topk=(1, ), threshold: float = None, activation: str = None)[source]¶ Computes the accuracy.
It can be used either for:
- a multi-class task, where:
you can use topk; threshold and activation are not required; targets is a tensor of shape batch_size; outputs is a tensor of shape batch_size x num_classes; it computes the accuracy@k for the specified values of k
- or a multi-label task, where:
you must specify threshold and activation; topk will not be used (because there is no method to apply top-k in multi-label classification); outputs and targets are tensors with shape batch_size x num_classes; targets is a tensor with binary vectors
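A usage sketch for the multi-class case (the values are hypothetical, and the exact return container is not asserted here):
>>> logits = torch.tensor([[0.1, 0.9], [0.8, 0.2]])  # batch_size x num_classes
>>> targets = torch.tensor([1, 0])
>>> accuracy(logits, targets, topk=(1,))  # accuracy@1 for this batch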
-
catalyst.utils.criterion.accuracy.
average_accuracy
(outputs, targets, k=10)[source]¶ Computes the average accuracy at k. This function computes the average accuracy at k between two lists of items.
- Parameters
outputs (list) – A list of predicted elements
targets (list) – A list of elements that are to be predicted
k (int, optional) – The maximum number of predicted elements
- Returns
The average accuracy at k over the input lists
- Return type
double
-
catalyst.utils.criterion.accuracy.
mean_average_accuracy
(outputs, targets, topk=(1, ))[source]¶ Computes the mean average accuracy at k. This function computes the mean average accuracy at k between two lists of lists of items.
- Parameters
outputs (list) – A list of lists of predicted elements
targets (list) – A list of lists of elements that are to be predicted
topk (int, optional) – The maximum number of predicted elements
- Returns
The mean average accuracy at k over the input lists
- Return type
double
-
catalyst.utils.criterion.dice.
dice
(outputs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]¶ Computes the dice metric
- Parameters
outputs (list) – A list of predicted elements
targets (list) – A list of elements that are to be predicted
eps (float) – epsilon
threshold (float) – threshold for outputs binarization
activation (str) – a torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”]
- Returns
Dice score
- Return type
double
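A sketch of the soft-Dice formula this computes, on already-activated outputs (the eps placement varies between implementations and is an assumption here):
import torch

def dice_sketch(outputs: torch.Tensor, targets: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    # dice = 2 * |A ∩ B| / (|A| + |B|), soft version via elementwise product
    intersection = torch.sum(targets * outputs)
    union = torch.sum(targets) + torch.sum(outputs)
    return (2.0 * intersection + eps) / (union + eps)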
-
catalyst.utils.criterion.f1_score.
f1_score
(outputs: torch.Tensor, targets: torch.Tensor, beta: float = 1.0, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid')[source]¶ Source https://github.com/qubvel/segmentation_models.pytorch
- Parameters
outputs (torch.Tensor) – A list of predicted elements
targets (torch.Tensor) – A list of elements that are to be predicted
eps (float) – epsilon to avoid zero division
beta (float) – beta param for f_score
threshold (float) – threshold for outputs binarization
activation (str) – a torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”]
- Returns
F_1 score
- Return type
float
-
catalyst.utils.criterion.focal.
sigmoid_focal_loss
(outputs: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0, alpha: float = 0.25, reduction: str = 'mean')[source]¶ Compute binary focal loss between target and output logits.
Source https://github.com/BloodAxe/pytorch-toolbelt See
losses
for details.- Parameters
outputs – Tensor of arbitrary shape
targets – Tensor of the same shape as input
reduction (string, optional) – Specifies the reduction to apply to the output: “none” | “mean” | “sum” | “batchwise_mean”. “none”: no reduction will be applied, “mean”: the sum of the output will be divided by the number of elements in the output, “sum”: the output will be summed.
See https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/loss/losses.py
-
catalyst.utils.criterion.focal.
reduced_focal_loss
(outputs: torch.Tensor, targets: torch.Tensor, threshold: float = 0.5, gamma: float = 2.0, reduction='mean')[source]¶ Compute reduced focal loss between target and output logits.
Source https://github.com/BloodAxe/pytorch-toolbelt See
losses
for details.- Parameters
outputs – Tensor of arbitrary shape
targets – Tensor of the same shape as input
reduction (string, optional) – specifies the reduction to apply to the output: “none” | “mean” | “sum” | “batchwise_mean”. “none”: no reduction will be applied; “mean”: the sum of the output will be divided by the number of elements in the output; “sum”: the output will be summed; “batchwise_mean” computes mean loss per sample in batch. Default: “mean”.
Note: size_average and reduce are in the process of being deprecated; in the meantime, specifying either of those two args will override reduction.
-
catalyst.utils.criterion.iou.
iou
(outputs: torch.Tensor, targets: torch.Tensor, classes: List[str] = None, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid') → Union[float, List[float]][source]¶ - Parameters
outputs (torch.Tensor) – A list of predicted elements
targets (torch.Tensor) – A list of elements that are to be predicted
eps (float) – epsilon to avoid zero division
threshold (float) – threshold for outputs binarization
activation (str) – a torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”]
- Returns
IoU (Jaccard) score(s)
- Return type
Union[float, List[float]]
-
catalyst.utils.criterion.iou.
jaccard
(outputs: torch.Tensor, targets: torch.Tensor, classes: List[str] = None, eps: float = 1e-07, threshold: float = None, activation: str = 'Sigmoid') → Union[float, List[float]]¶ - Parameters
outputs (torch.Tensor) – A list of predicted elements
targets (torch.Tensor) – A list of elements that are to be predicted
eps (float) – epsilon to avoid zero division
threshold (float) – threshold for outputs binarization
activation (str) – a torch.nn activation applied to the outputs. Must be one of [“none”, “Sigmoid”, “Softmax2d”]
- Returns
IoU (Jaccard) score(s)
- Return type
Union[float, List[float]]
Meters¶
The meters from torchnet.meters
-
class
catalyst.utils.meters.meter.
Meter
[source]¶ Bases:
object
Meters provide a way to keep track of important statistics in an online manner.
This class is abstract, but provides a standard interface for all meters to follow.
-
class
catalyst.utils.meters.apmeter.
APMeter
[source]¶ Bases:
catalyst.utils.meters.meter.Meter
The APMeter measures the average precision per class.
The APMeter is designed to operate on NxK Tensors output and target, and optionally a Nx1 Tensor weight where (1) the output contains model output scores for N examples and K classes that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a sigmoid function); (2) the target contains only values 0 (for negative examples) and 1 (for positive examples); and (3) the weight ( > 0) represents weight for each sample.
-
add
(output, target, weight=None)[source]¶ Add a new observation
- Parameters
output (Tensor) – NxK tensor that for each of the N examples indicates the probability of the example belonging to each of the K classes, according to the model. The probabilities should sum to one over all classes
target (Tensor) – binary NxK tensor that encodes which of the K classes are associated with the N-th input (eg: a row [0, 1, 0, 1] indicates that the example is associated with classes 2 and 4)
weight (optional, Tensor) – Nx1 tensor representing the weight for each example (each weight > 0)
-
-
class
catalyst.utils.meters.aucmeter.
AUCMeter
[source]¶ Bases:
catalyst.utils.meters.meter.Meter
The AUCMeter measures the area under the receiver-operating characteristic (ROC) curve for binary classification problems. The area under the curve (AUC) can be interpreted as the probability that, given a randomly selected positive example and a randomly selected negative example, the positive example is assigned a higher score by the classification model than the negative example.
The AUCMeter is designed to operate on one-dimensional Tensors output and target, where (1) the output contains model output scores that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a sigmoid function); and (2) the target contains only values 0 (for negative examples) and 1 (for positive examples).
-
class
catalyst.utils.meters.confusionmeter.
ConfusionMeter
(k, normalized=False)[source]¶ Bases:
catalyst.utils.meters.meter.Meter
Maintains a confusion matrix for a given classification problem.
The ConfusionMeter constructs a confusion matrix for multi-class classification problems. It does not support multi-label, multi-class problems: for such problems, please use MultiLabelConfusionMeter.
- Parameters
k (int) – number of classes in the classification problem
normalized (boolean) – determines whether the confusion matrix is normalized
-
add
(predicted, target)[source]¶ Computes the confusion matrix of K x K size where K is the number of classes.
- Parameters
predicted (tensor) – can be an N x K tensor of predicted scores obtained from the model for N examples and K classes, or an N-tensor of integer values between 0 and K-1
target (tensor) – can be an N-tensor of integer values assumed to be between 0 and K-1, or an N x K tensor, where targets are assumed to be provided as one-hot vectors
-
class
catalyst.utils.meters.mapmeter.
mAPMeter
[source]¶ Bases:
catalyst.utils.meters.meter.Meter
The mAPMeter measures the mean average precision over all classes.
The mAPMeter is designed to operate on NxK Tensors output and target, and optionally a Nx1 Tensor weight where (1) the output contains model output scores for N examples and K classes that ought to be higher when the model is more convinced that the example should be positively labeled, and smaller when the model believes the example should be negatively labeled (for instance, the output of a sigmoid function); (2) the target contains only values 0 (for negative examples) and 1 (for positive examples); and (3) the weight ( > 0) represents weight for each sample.