# Quickstart

## Basic usage

Calculate squares of numbers:
```python
import torch
from gpuparallel import GPUParallel, delayed

def perform(idx, gpu_id, **kwargs):
    tensor = torch.Tensor([idx]).to(gpu_id)
    return (tensor * tensor).item()

result = GPUParallel(n_gpu=2)(delayed(perform)(idx) for idx in range(5))
print(sorted(result))  # [0.0, 1.0, 4.0, 9.0, 16.0]
```
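Note that `sorted()` is used above, presumably because results arrive in completion order rather than submission order. Under the hood, `delayed` defers the call, packaging the function and its arguments for a worker to execute later. A minimal stand-alone sketch of that idea (a hypothetical reimplementation for illustration, not gpuparallel's actual source):

```python
def delayed(func):
    """Capture a function and its arguments for later execution."""
    def capture(*args, **kwargs):
        return func, args, kwargs
    return capture

def square(idx):
    return idx * idx

# Build the task list without running anything yet.
tasks = [delayed(square)(i) for i in range(5)]

# A worker would later execute each captured call:
results = [f(*args, **kwargs) for f, args, kwargs in tasks]
print(results)  # [0, 1, 4, 9, 16]
```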
## Initialize networks on worker init
The function passed as `init_fn` is called on init of every worker, so all common resources (e.g. networks) can be initialized there. In this example we create 32 workers on 16 GPUs (two per GPU), initialize the model when the workers start, and then reuse the workers for several batches of tasks:
```python
from gpuparallel import GPUParallel, delayed

def init(gpu_id=None, **kwargs):
    global model
    model = load_model().to(gpu_id)

def perform(img, gpu_id=None, **kwargs):
    global model
    return model(img.to(gpu_id))

gp = GPUParallel(n_gpu=16, n_workers_per_gpu=2, init_fn=init)
results = gp(delayed(perform)(img) for img in images)
```
## Reuse initialized workers
Once initialized, workers stay alive for as long as the `GPUParallel` object exists. You can run several queues of tasks without reinitializing worker resources:
```python
gp = GPUParallel(n_gpu=16, n_workers_per_gpu=2, init_fn=init)

overall_results = []
for folder_images in folders:
    folder_results = gp(delayed(perform)(img) for img in folder_images)
    overall_results.extend(folder_results)

del gp  # this will close the process pool to free memory
```
## Simple logging from workers
Call `log_to_stderr()` to initialize logging, and use `log.info(message)` to log from workers:
```python
from gpuparallel import GPUParallel, delayed, log_to_stderr, log

log_to_stderr()

def perform(idx, worker_id=None, gpu_id=None):
    hi = f'Hello world #{idx} from worker #{worker_id} with GPU#{gpu_id}!'
    log.info(hi)

GPUParallel(n_gpu=2)(delayed(perform)(idx) for idx in range(2))
```
It will output:

```
[INFO/Worker-1(GPU1)]:Hello world #1 from worker #1 with GPU#1!
[INFO/Worker-0(GPU0)]:Hello world #0 from worker #0 with GPU#0!
```
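For comparison, the standard library offers a similar facility: `multiprocessing.log_to_stderr()` returns a module-level logger whose records include the emitting process name, which is presumably the kind of mechanism gpuparallel's helpers build on:

```python
import logging
import multiprocessing

# Attach a stderr handler to the multiprocessing logger and set its level;
# emitted records are tagged with the process name.
log = multiprocessing.log_to_stderr(logging.INFO)
log.info('Hello from the main process!')
```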