API Reference

class gpuparallel.GPUParallel(device_ids: Optional[List[str]] = None, n_gpu: Optional[Union[int, str]] = None, n_workers_per_gpu=1, init_fn: Optional[Callable] = None, preserve_order=True, progressbar=True, pbar_description=None, ignore_errors=False, debug=False)

Bases: object

__init__(device_ids: Optional[List[str]] = None, n_gpu: Optional[Union[int, str]] = None, n_workers_per_gpu=1, init_fn: Optional[Callable] = None, preserve_order=True, progressbar=True, pbar_description=None, ignore_errors=False, debug=False)

Parallel execution of functions passed to __call__.

Parameters:
  • device_ids – List of GPU ids to use, e.g. ['cuda:3', 'cuda:4']. The library doesn't check whether the GPUs are actually available; it simply provides a consistent worker_id and device_id to both init_fn and the task functions.

  • n_gpu – Number of GPUs to use, a shortcut for device_ids=[f'cuda:{i}' for i in range(n_gpu)]. The parameters n_gpu and device_ids are mutually exclusive. If neither is given, a single cuda:0 is used.

  • n_workers_per_gpu – Number of workers to spawn on each GPU.

  • init_fn – Function called once during worker initialization. It must accept the parameters worker_id and device_id (or **kwargs). Useful for initializing shared state (e.g. neural networks); see the sketch after this list.

  • preserve_order – Return results in the same order as the input tasks.

  • progressbar – Display a tqdm progress bar.

  • pbar_description – Description shown on the progress bar.

  • ignore_errors – If True, errors raised inside tasks are ignored; otherwise they are re-raised.

  • debug – When this parameter is True, the parameters n_gpu and device_ids are ignored. The class creates a single worker (device_id='cuda:0') and runs it in the same process, which makes debugging easier.
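
A minimal usage sketch (the init and task functions below are hypothetical placeholders; only the constructor arguments come from this reference):

    from gpuparallel import GPUParallel, delayed

    def init(worker_id=None, device_id=None, **kwargs):
        # Hypothetical: runs once per worker; in practice, load a model to device_id here.
        global offset
        offset = 100

    def task(x, worker_id=None, device_id=None, **kwargs):
        # Hypothetical task using the state prepared by init_fn.
        return x + offset

    gp = GPUParallel(n_gpu=2, n_workers_per_gpu=1, init_fn=init)
    results = list(gp(delayed(task)(i) for i in range(8)))  # [100, 101, ..., 107]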

__del__()

The created pool is freed only in this destructor. This allows __call__ to be used multiple times with the same initialized workers.
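
A sketch of reuse, assuming the init and task placeholders from the sketch above:

    gp = GPUParallel(n_gpu=2, init_fn=init)
    first = list(gp(delayed(task)(i) for i in range(4)))
    second = list(gp(delayed(task)(i) for i in range(4)))  # same workers; init_fn does not run again
    del gp  # the pool is freed here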

__call__(tasks: Iterable[Callable]) → Generator

Submits tasks to the pool and collects the results of the computations; see the example below.

Parameters:

tasks – A list or generator of callables to be executed. Each callable must accept the parameters worker_id and device_id (or **kwargs).

Returns:

A list of results, or a generator over them.
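
A sketch of submitting tasks built with delayed (square is a hypothetical placeholder):

    from gpuparallel import GPUParallel, delayed

    def square(x, worker_id=None, device_id=None, **kwargs):
        return x * x

    gp = GPUParallel(n_gpu=1)
    for result in gp(delayed(square)(x) for x in range(5)):
        print(result)  # 0, 1, 4, 9, 16 (input order, since preserve_order=True)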

class gpuparallel.BatchGPUParallel(task_fn: Callable, batch_size, flat_result=False, *args, **kwargs)

Bases: GPUParallel

__init__(task_fn: Callable, batch_size, flat_result=False, *args, **kwargs)

Parallel execution of task_fn with the parameters given to __call__. Tasks are batched: inside task_fn, every arg and kwarg arrives as a list holding one batch of values.

Parameters:
  • task_fn – Task function to be executed.

  • batch_size – Number of inputs grouped into each batch.

  • flat_result – Unbatch the results. Works only when task_fn returns a single tensor.

__call__(*args, **kwargs) → Generator

All input parameters to be batched must have the same length along the first axis; the first arg/kwarg determines the size of the dataset. Inputs with a different shape (or that are not Sequence-typed) are copied to every worker without batching.

Returns:

Batched results
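
A sketch of batched execution (batch_mean is a hypothetical placeholder; treating each batch as a NumPy slice is an assumption):

    import numpy as np
    from gpuparallel import BatchGPUParallel

    def batch_mean(batch, worker_id=None, device_id=None, **kwargs):
        # `batch` holds up to batch_size items taken along the first axis of `data`
        return np.asarray(batch).mean(axis=(1, 2))  # single tensor out, so flat_result applies

    data = np.random.rand(16, 32, 32)  # the first axis (16) sets the dataset size
    bgp = BatchGPUParallel(task_fn=batch_mean, batch_size=4, flat_result=True, n_gpu=1)
    means = list(bgp(data))  # flat_result=True unbatches 4 batches into 16 values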

gpuparallel.delayed(func)

Decorator used to capture the arguments of a function; an analogue of joblib's delayed. See the example below.

Parameters:

func – Function to be captured.
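
A brief sketch of the capture step (square is a hypothetical placeholder):

    from gpuparallel import delayed

    def square(x, **kwargs):
        return x * x

    task = delayed(square)(3)  # the argument is captured; nothing executes yet
    # `task` can now be submitted through GPUParallel.__call__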

gpuparallel.log_to_stderr(log_level='INFO', force=False)

Shortcut for displaying logs from workers; see the sketch after the parameter list.

Parameters:
  • log_level – Set the logging level of this logger.

  • force – Add the handler even if other handlers are already registered.
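
A sketch of enabling worker logs (hello is a hypothetical placeholder; it assumes standard logging calls inside workers are forwarded once the handler is installed):

    import logging
    from gpuparallel import GPUParallel, delayed, log_to_stderr

    log_to_stderr('INFO')

    def hello(idx, worker_id=None, device_id=None, **kwargs):
        # Assumption: root-logger records from workers reach stderr via the installed handler.
        logging.info(f'Hello #{idx} from worker {worker_id} on {device_id}')

    gp = GPUParallel(n_gpu=2)
    list(gp(delayed(hello)(i) for i in range(4)))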