Submitted by DevarshTare t3_11725n6 in MachineLearning
[removed]
Thanks a lot!
https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
https://lambdalabs.com/gpu-benchmarks
How much VRAM you need depends mostly on the number of parameters of the model, plus some extra for the data. At FP32 precision each parameter needs 4 bytes, at FP16 or BF16 2 bytes, and at FP8 or INT8 only one byte. Almost all models can be run at FP16 without noticeable accuracy loss; FP8 sometimes works and sometimes doesn't, depending on the model.
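A rough back-of-the-envelope sketch of that estimate in Python (weights only; actual usage also includes activations, optimizer state, and framework overhead, so treat the numbers as a lower bound):

```python
# Bytes per parameter for common precisions
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1, "int8": 1}

def weight_vram_gib(num_params: float, dtype: str = "fp16") -> float:
    """VRAM needed just to hold the model weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3

# e.g. a 7B-parameter model: ~26 GiB at fp32, ~13 GiB at fp16, ~6.5 GiB at int8
for dtype in ("fp32", "fp16", "int8"):
    print(dtype, round(weight_vram_gib(7e9, dtype), 1), "GiB")
```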
Appreciate it! This gave me a better picture. I was stuck between the 3060 Ti and the 3070; in that case the 3060 Ti is the logical option. I will be using Colab for training, and can probably optimise the model to run in 8 GB, if I'm not wrong?
The 3070 and 3060 Ti both have 8 GB, and while the 3070 will be a bit faster, most people will agree the difference is not worth the price on a tight budget.
For training, the extra 4 GB on the plain 3060 is quite useful, but for inference only you can run most small and medium models (such as Stable Diffusion) in 8 GB, and the 3060 Ti will be faster.
I'm using a 3060 (no Ti) with 12 GB VRAM and train locally as well. Performance is fine, too.
That's interesting. I was considering that purchase since it makes sense to run larger datasets or models on the RTX 3060, but its Tensor core count is significantly lower. The GPU would definitely run much larger models, just at a lower speed, I assume?
How has your experience been with larger models, especially video- or image-based ones?
I actually have a 3060 too. In theory a 3060 Ti should be up to 30% faster, but most of the time the 3060 is fast enough, and faster than any T4.
For generating a few images with Stable Diffusion the difference might be 15 vs. 20 seconds; for running Whisper on several hours of audio it could be 45 minutes vs. an hour. The difference only matters if the model is optimized to fully use the GPU in the first place.
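If you want to sanity-check that last point yourself, here's a minimal timing sketch (assuming PyTorch and a CUDA card): raw fp16 throughput on a big matmul is roughly where a 3060 Ti's ~30% edge shows up, and a model that can't keep the GPU busy like this won't show that gap either.

```python
import time
import torch

device = "cuda"
x = torch.randn(4096, 4096, device=device, dtype=torch.float16)

torch.cuda.synchronize()  # make sure the GPU is idle before timing
start = time.perf_counter()
for _ in range(100):
    x @ x
torch.cuda.synchronize()  # wait for all queued kernels to finish
print(f"100 fp16 4096x4096 matmuls: {time.perf_counter() - start:.3f}s")
```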
I've seen the same across multiple threads now: VRAM makes the difference between being able to run a model at all and having to optimize it. This has been really helpful, thanks a lot, guys!
VRAM has no effect on speed, but if you don't have enough to load the model and data, you can't train at all (you'll hit a CUDA out-of-memory error).
For performance, just look at benchmark charts.
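A minimal sketch of checking free VRAM up front and catching that OOM error, assuming a recent PyTorch (torch.cuda.mem_get_info and torch.cuda.OutOfMemoryError are in current releases); the nn.Linear here is just a stand-in for a real model:

```python
import torch
import torch.nn as nn

# How much VRAM is free before we try to load anything?
free, total = torch.cuda.mem_get_info()
print(f"free VRAM: {free / 1024**3:.1f} / {total / 1024**3:.1f} GiB")

model = nn.Linear(8192, 8192)  # placeholder for your actual model
try:
    model = model.cuda()
except torch.cuda.OutOfMemoryError:
    print("Doesn't fit: try fp16 weights, a smaller batch, or gradient checkpointing")
```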