
Final-Rush759 t1_jb5bxf5 wrote

Only 2× more than the 3060? Maybe you are power limited or CPU bottlenecked when using both GPUs, or PCIe bandwidth limited.

1

incrediblediy t1_jb5dzqa wrote

This was when they were running individually on a full x16 PCIe 4.0 slot, and it is roughly what the FP32 TFLOPS ratio would predict (35.58 / 12.74 ≈ 2.8×, specs below). (i.e. I compared times when I had only the 3060 vs. the 3090 in the same slot, running the model on a single GPU each time.)
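For reference, a minimal sketch of that kind of one-card-at-a-time comparison (layer and batch sizes are placeholders, and it assumes both cards are visible as cuda:0 and cuda:1):

```python
import time
import torch

def avg_step_time(device, n_iters=50):
    """Average forward+backward time for a dummy layer on one GPU."""
    device = torch.device(device)
    model = torch.nn.Linear(1024, 1024).to(device)
    x = torch.randn(256, 1024, device=device)
    torch.cuda.synchronize(device)
    start = time.perf_counter()
    for _ in range(n_iters):
        model(x).sum().backward()
    # CUDA kernel launches are asynchronous; sync before stopping the clock.
    torch.cuda.synchronize(device)
    return (time.perf_counter() - start) / n_iters

print("cuda:0 (e.g. 3090):", avg_step_time("cuda:0"))
print("cuda:1 (e.g. 3060):", avg_step_time("cuda:1"))
```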

I don't do much training on the 3060 now; it is mostly just driving the monitors, etc.

I have changed the batch sizes to suit the 24 GB of VRAM anyway, as I am working with CV data. It could be a bit different with other types of models.

3060 = FP32 (float) 12.74 TFLOPS (https://www.techpowerup.com/gpu-specs/geforce-rtx-3060.c3682)
3090 = FP32 (float) 35.58 TFLOPS (https://www.techpowerup.com/gpu-specs/geforce-rtx-3090.c3622)

I must say the 3060 is a wonderful card, and it helped me a lot until I found this ex-mining 3090. Really worth the price with 12 GB of VRAM.

1

Final-Rush759 t1_jb5f7eu wrote

I used mixed-precision training, so the math should have been largely FP16. But you can feed inputs as float32; PyTorch AMP will autocast them to FP16 where appropriate. I only get about a 2× speedup with the 3090.
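A minimal sketch of that AMP pattern (placeholder model and data; assumes a CUDA GPU): the inputs stay float32, and autocast handles the FP16 casting for eligible ops.

```python
import torch

device = torch.device("cuda")
model = torch.nn.Linear(512, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()

for step in range(100):
    # float32 inputs are fine; autocast casts eligible ops (matmul, conv) to FP16.
    inputs = torch.randn(64, 512, device=device)
    targets = torch.randint(0, 10, (64,), device=device)

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = loss_fn(model(inputs), targets)
    # GradScaler scales the loss to avoid FP16 gradient underflow.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```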

1