FastestLearner t1_j5mwz47 wrote on January 24, 2023 at 3:14 AM

Reply to comment by ArnoF7 in [D] Multiple Different GPUs? by Maxerature

It is possible, but you would have to write custom code for every memory-copy operation you want to perform, i.e. every tensor.to(device) call. You can get away with that on a small project, but it could become prohibitively cumbersome on a large one. You'd also still need two forward passes: one with the data loaded directly onto the 3080, and another with the data that was first loaded onto the 1080 and then transferred to the 3080. Whether this is beneficial boils down to the difference in transfer rates between the RAM-3080 route and the RAM-1080-3080 route. I can't tell which one is faster without benchmarking.
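
For anyone who wants to run that benchmark, here is a rough sketch of how the two routes could be timed in PyTorch. The device indices (cuda:0 for the 3080, cuda:1 for the 1080), the 256 MiB buffer size, and the helper names `median_seconds` and `compare_routes` are all assumptions for illustration, not something from the setup discussed above.

```python
import statistics
import time


def median_seconds(copy_fn, sync_fn, repeats=20):
    """Median wall-clock time of copy_fn over `repeats` runs, after one warm-up."""
    copy_fn()  # warm-up, so one-time allocation cost is not measured
    sync_fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        copy_fn()
        sync_fn()  # CUDA copies can be asynchronous; wait before stopping the clock
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_routes(num_bytes=256 * 1024 * 1024):
    """Time RAM -> cuda:0 against RAM -> cuda:1 -> cuda:0.

    Assumption: cuda:0 is the 3080 and cuda:1 is the 1080; swap if needed.
    """
    import torch  # imported lazily so the timing helper works without PyTorch

    if torch.cuda.device_count() < 2:
        raise RuntimeError("this comparison needs two visible CUDA devices")

    fast = torch.device("cuda:0")  # assumed to be the 3080
    slow = torch.device("cuda:1")  # assumed to be the 1080
    # Pinned host memory, as a data loader with pin_memory=True would provide.
    batch = torch.empty(num_bytes, dtype=torch.uint8).pin_memory()

    def sync():
        torch.cuda.synchronize(fast)
        torch.cuda.synchronize(slow)

    direct = median_seconds(lambda: batch.to(fast), sync)
    staged = median_seconds(lambda: batch.to(slow).to(fast), sync)
    return {"ram_to_3080": direct, "ram_to_1080_to_3080": staged}
```

Calling `compare_routes()` on a machine with both cards returns the median seconds per copy for each route. This only measures the copies themselves, so it says nothing about how the extra forward pass affects overall throughput.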

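DeepSpeed handles the to-and-fro transfers between RAM and the 3080 automatically. Its ZeRO-Offload feature moves optimizer state and gradients to host RAM and back, which frees GPU memory for larger batch sizes.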
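
As a rough idea of what turning on that kind of offloading looks like, here is a minimal sketch of a DeepSpeed config fragment built as a Python dict. The batch size, the ZeRO stage, and the output file name `ds_config.json` are assumptions for illustration; a real config would also need optimizer settings and whatever else your training script expects.

```python
import json

# Sketch of a DeepSpeed config fragment: ZeRO stage 2 with the optimizer
# state offloaded to pinned host RAM. All values are illustrative assumptions.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,  # assumed value, tune for your model
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {
            "device": "cpu",     # keep optimizer state in host RAM
            "pin_memory": True,  # pinned memory for faster host-GPU copies
        },
    },
}

# Write it out so it can be handed to a training script as a JSON file.
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```

The dict (or the JSON file) would then be passed to `deepspeed.initialize` through its `config` argument, alongside the model and its parameters.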

ArnoF7 t1_j5omrfh wrote on January 24, 2023 at 2:22 PM

Great insight. Appreciate it