Engines

An overview of the available engines can be found in the examples/engines section.

CPUEngine

class catalyst.engines.torch.CPUEngine(*args, **kwargs)[source]

Bases: catalyst.core.engine.Engine

CPU-based engine.
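
A minimal usage sketch (not taken from the Catalyst docs themselves): assuming the catalyst.dl.SupervisedRunner API, an engine instance is passed to runner.train via the engine argument. The toy data and model below are hypothetical, for illustration only; GPUEngine and DataParallelEngine plug in the same way.

    import torch
    from torch import nn, optim
    from torch.utils.data import DataLoader, TensorDataset

    from catalyst import dl
    from catalyst.engines.torch import CPUEngine

    # toy data, model, and loss -- hypothetical, for illustration only
    loader = DataLoader(
        TensorDataset(torch.rand(64, 10), torch.rand(64, 1)),
        batch_size=16,
    )
    model = nn.Linear(10, 1)

    runner = dl.SupervisedRunner()
    runner.train(
        model=model,
        criterion=nn.MSELoss(),
        optimizer=optim.Adam(model.parameters()),
        loaders={"train": loader},
        num_epochs=1,
        engine=CPUEngine(),  # GPUEngine() / DataParallelEngine() are passed the same way
    )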

GPUEngine

class catalyst.engines.torch.GPUEngine(*args, **kwargs)[source]

Bases: catalyst.core.engine.Engine

Single-GPU-based engine.

DataParallelEngine

class catalyst.engines.torch.DataParallelEngine(*args, **kwargs)[source]

Bases: catalyst.core.engine.Engine

Multi-GPU-based engine.

DistributedDataParallelEngine

class catalyst.engines.torch.DistributedDataParallelEngine(*args, address: str = '127.0.0.1', port: Union[str, int] = 2112, world_size: Optional[int] = None, workers_dist_rank: int = 0, num_node_workers: Optional[int] = None, process_group_kwargs: Optional[Dict[str, Any]] = None, **kwargs)[source]

Bases: catalyst.core.engine.Engine

Distributed multi-GPU-based engine.

Parameters
  • *args – args for Accelerator.__init__

  • address – the master node's (rank 0) address; it should be either the IP address or the hostname of node 0. For single-node multi-process training it can simply be 127.0.0.1.

  • port – a free port on the master node (rank 0) to be used for communication during distributed training.

  • world_size – the number of processes to use for distributed training. It should be less than or equal to the number of GPUs.

  • workers_dist_rank – the rank of the first process to run on the node. It should be a number between the number of already-initialized processes and world_size - 1; the other processes on the node will have ranks # of initialized processes + 1, # of initialized processes + 2, …, # of initialized processes + num_node_workers - 1.

  • num_node_workers – the number of processes to launch on the node. For GPU training, it is recommended to set this to the number of GPUs on the current node so that each process can be bound to a single GPU.

  • process_group_kwargs – parameters for torch.distributed.init_process_group. More info here: https://pytorch.org/docs/stable/distributed.html#torch.distributed.init_process_group

  • **kwargs – kwargs for Accelerator.__init__
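
A construction sketch for this engine, assuming single-node training on two GPUs; the concrete values below are illustrative, not defaults to copy.

    from catalyst.engines.torch import DistributedDataParallelEngine

    engine = DistributedDataParallelEngine(
        address="127.0.0.1",   # master node (rank 0) address; localhost for single-node training
        port=2112,             # a free port on the master node
        world_size=2,          # total number of processes, <= number of GPUs
        workers_dist_rank=0,   # rank of the first process launched on this node
        num_node_workers=2,    # processes to launch on this node, one per GPU
        process_group_kwargs={"backend": "nccl"},  # forwarded to torch.distributed.init_process_group
    )

    # The engine is then passed to a runner exactly as in the earlier sketch:
    # runner.train(..., engine=engine)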

DistributedXLAEngine

class catalyst.engines.torch.DistributedXLAEngine(*args, **kwargs)[source]

Bases: catalyst.core.engine.Engine

Distributed XLA-based engine.