
FuB4R32 t1_iu41a65 wrote

Is this for training or inference? The easiest thing to do is to split the batch across multiple GPUs (data parallelism). If you can't even fit batch=1 on a single GPU, though, then model parallelism is generally a harder problem.
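For the data-parallel case, here's a minimal sketch assuming PyTorch and at least two visible GPUs; the toy model, sizes, and batch of 64 are just placeholders, not your actual setup:

```python
# Minimal data-parallel sketch: nn.DataParallel replicates the model on each
# visible GPU and splits every incoming batch across them.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)   # scatter each batch across all GPUs
model = model.cuda()

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 512).cuda()      # batch of 64 -> e.g. 32 per GPU on 2 GPUs
y = torch.randint(0, 10, (64,)).cuda()

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```

Note this only helps if a single forward/backward pass for your per-GPU slice fits in memory; it doesn't shard the model itself.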

7

sabeansauce OP t1_iu45f4w wrote

For training. Essentially I have to choose between one powerful GPU or multiple average ones. But I know the average ones on their own don't have enough memory for the task at hand (because I already have one). I'd prefer the single GPU, but the company is asking whether a multi-GPU setup of individually weaker cards would also work if they're used together.

3

FuB4R32 t1_iu45yrl wrote

Yeah, I think I understand. E.g. Google Cloud has a great deal on K80s, especially if you commit to the costs up front. If you have even a handful of mid-range GPUs, training should be faster anyway since you can reach a larger effective batch size, but it depends on the details ofc.
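If you go that route, the more idiomatic setup is DistributedDataParallel, where the effective batch is per-GPU batch × number of GPUs. A hedged sketch, assuming PyTorch with the NCCL backend, a single node, and a torchrun launch; the linear model and dummy tensors are placeholders:

```python
# One process per GPU; each process sees a distinct shard of the data, and
# gradients are all-reduced so the result matches one big-batch update.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    dist.init_process_group("nccl")              # env vars come from torchrun
    rank = dist.get_rank()                       # equals the GPU index on one node
    torch.cuda.set_device(rank)

    model = torch.nn.Linear(512, 10).cuda(rank)  # stand-in for the real model
    model = DDP(model, device_ids=[rank])

    # dummy data; effective batch = 16 per GPU * number of GPUs (e.g. 16 * 4 = 64)
    ds = TensorDataset(torch.randn(1024, 512), torch.randint(0, 10, (1024,)))
    loader = DataLoader(ds, batch_size=16, sampler=DistributedSampler(ds))

    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for x, y in loader:
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x.cuda(rank)), y.cuda(rank))
        loss.backward()                          # DDP all-reduces gradients here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()   # launch with e.g.: torchrun --nproc_per_node=4 train_ddp.py
```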

3