Submitted by Infamous_Age_7731 t3_107pcux in deeplearning
I am training a DL model both locally and on a VM from a private vendor. Locally I have an RTX 3080 Ti (12 GB); on the cloud, for more memory, I am using an Ampere A100 (80 GB).
I had the feeling that the VM GPU was a bit slow, so I ran the exact same setup with the exact same hyper-params (e.g. batch size) on both, and again the local RTX 3080 Ti was much faster than the A100. When I actually measured it, it was roughly 2-3x faster.
Is that because of the card itself? (It's a server GPU, and I saw that the 80 GB is actually 2x40 GB connected with NVLink; could that be it?) Or is it standard practice for VM companies to throttle the GPU?
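For reference, this is roughly the kind of timing comparison I mean. It's a minimal sketch assuming PyTorch, with a placeholder model, a fixed batch size, and synthetic data kept on the GPU so data loading can't hide a difference between the two cards:

```python
import time
import torch
import torch.nn as nn

device = torch.device("cuda")

# Placeholder model and batch size; swap in whatever you actually train.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(64 * 224 * 224, 10),
).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Synthetic data already on the GPU, so disk/CPU speed is out of the picture.
x = torch.randn(32, 3, 224, 224, device=device)
y = torch.randint(0, 10, (32,), device=device)

# Warm-up so allocation and cuDNN autotuning don't skew the timing.
for _ in range(10):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

torch.cuda.synchronize()
t0 = time.time()
steps = 100
for _ in range(steps):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
torch.cuda.synchronize()

print(f"{steps / (time.time() - t0):.1f} steps/sec on {torch.cuda.get_device_name(0)}")
```

Running the same script on both machines should make it clear whether the gap is in the GPU itself or somewhere else in the pipeline.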
susoulup t1_j3npxch wrote
I'm not far enough into deep learning to be using cloud GPUs as well as local ones. I have benchmarked one locally, and I'm not sure I did it properly, so I don't really have any advice. My question is whether ECC memory plays a factor in how the data is processed and stored? I thought that was one of the advantages of using a workstation GPU, but I could be way off.
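If it helps, here's a rough way to check whether ECC is even enabled on each card. It's just a sketch that assumes nvidia-smi is on the PATH, and I'm not certain the query field name is identical on every driver version:

```python
import subprocess

# Ask nvidia-smi for the card name and current ECC mode.
# ECC is usually on by default for datacenter cards (A100) and unavailable
# on GeForce cards (3080 Ti), so the two machines will likely differ here.
result = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,ecc.mode.current", "--format=csv"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```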