Submitted by TheButteryNoodle t3_zau0uc in deeplearning
Dexamph t1_iyoebd1 wrote
Reply to comment by computing_professor in GPU Comparisons: RTX 6000 ADA vs A100 80GB vs 2x 4090s by TheButteryNoodle
You certainly can if you put the time and effort into model parallelisation, it just isn't the seamless single big memory pool that I and many others were expecting, where a larger model that won't fit on one GPU runs with no code changes or debugging. Notice how most published NVLink benchmarks only test data-parallel training? That's because it's the straightforward case.
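To give a sense of the kind of code change involved, here's a minimal, hypothetical sketch of naive model parallelism in PyTorch (the layer sizes and the manual two-GPU split are made up for illustration, and it assumes two visible CUDA devices):

```python
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    """Toy model split by hand across two GPUs."""
    def __init__(self):
        super().__init__()
        # Each half of the model lives on a different device.
        self.part1 = nn.Linear(4096, 4096).to("cuda:0")
        self.part2 = nn.Linear(4096, 4096).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # Activations have to be moved between GPUs explicitly.
        x = self.part2(x.to("cuda:1"))
        return x

model = TwoGPUModel()
out = model(torch.randn(8, 4096))  # output ends up on cuda:1
```

Nothing gets pooled here: you decide by hand which weights live on which GPU and pay the cost of shuffling activations between them.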
computing_professor t1_iyokex9 wrote
Huh. If it requires parallelization then why is the 3090 singled out as the one consumer GeForce card that is capable of memory pooling? It just seems weird. What exactly is memory pooling then, that the 3090 is capable of? I'm clearly confused.
edit: I did find this from Puget that says
> For example, a system with 2x GeForce RTX 3090 GPUs would have 48GB of total VRAM
So it's possible to pool memory with a pair of 3090s. But I'm not sure how it's done in practice.
DingWrong t1_iyq0nr0 wrote
Big models get sharded and the chunks get loaded onto each GPU. There are a lot of frameworks ready for this, since the big NLP models can't fit on a single GPU. Alpa even shards the model across different machines.
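As a hedged example of how those frameworks handle it, the Hugging Face transformers + accelerate stack can shard a checkpoint across whatever GPUs it finds (the model name below is just a placeholder, and this assumes accelerate is installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-j-6B"  # example checkpoint, swap in your own
tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" splits the weights across every visible GPU
# (spilling to CPU/disk if needed) instead of pooling memory.
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

inputs = tokenizer("Sharding demo:", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

Each GPU only holds its own slice of the weights, so you can load something bigger than either card on its own, but it's sharding, not a single pooled address space.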
computing_professor t1_iyqaku8 wrote
Thanks. So it really isn't the same as how the Quadro cards share vram. That's really confusing.
Dexamph t1_izd1dy7 wrote
This is deadass wrong, as that Puget statement was in the context of system memory and has nothing to do with pooling:

> How much RAM does machine learning and AI need?
>The first rule of thumb is to have at least double the amount of CPU memory as there is total GPU memory in the system. For example, a system with 2x GeForce RTX 3090 GPUs would have 48GB of total VRAM – so the system should be configured with 128GB (96GB would be double, but 128GB is usually the closest configurable amount).
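Just to spell out the arithmetic behind that rule of thumb (values assumed for a 2x RTX 3090 box):

```python
# Rule of thumb: system RAM >= 2x total GPU VRAM.
vram_per_gpu_gb = 24
num_gpus = 2
total_vram_gb = vram_per_gpu_gb * num_gpus   # 48 GB of VRAM
min_system_ram_gb = 2 * total_vram_gb        # 96 GB, rounded up to 128 GB in practice
print(total_vram_gb, min_system_ram_gb)
```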