Submitted by sabeansauce t3_yfjfkh in deeplearning
If a task requires a certain amount of available memory on a GPU, and there are two GPUs that cannot individually run the task, will the memory of each be combined if they are run together? Does it work like that? Or does each GPU have to be capable memory-wise on its own?
FuB4R32 t1_iu41a65 wrote
Is this for training or inference? The easiest thing to do is to split the batch across multiple GPUs (data parallelism). If you can't even fit batch=1 on a single GPU, though, then model parallelism is generally a harder problem. See the sketch below.
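To illustrate the distinction the comment draws, here is a minimal PyTorch sketch. The model, layer sizes, and batch size are placeholders chosen for illustration, not anything from the thread; data parallelism still requires the full model to fit on each GPU, while model parallelism spreads the weights themselves across devices.

    import torch
    import torch.nn as nn

    # Placeholder model, sizes are arbitrary.
    model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))

    # Data parallelism: every GPU holds a full copy of the model and processes
    # a slice of the batch. Per-GPU activation memory shrinks, but the model
    # itself must still fit on a single GPU.
    if torch.cuda.device_count() > 1:
        dp_model = nn.DataParallel(model).cuda()
        x = torch.randn(64, 1024).cuda()   # batch of 64, split across GPUs
        out = dp_model(x)

    # Model parallelism: different layers live on different GPUs, so the
    # weights are spread across devices and activations move between GPUs
    # in forward(). Harder to do efficiently, but it lets a model that is
    # too big for one GPU use the combined memory of several.
    class TwoGPUModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.part1 = nn.Linear(1024, 4096).to("cuda:0")
            self.part2 = nn.Linear(4096, 10).to("cuda:1")

        def forward(self, x):
            x = self.part1(x.to("cuda:0"))
            return self.part2(x.to("cuda:1"))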