
computing_professor t1_iyokex9 wrote

Huh. If it requires parallelization, then why is the 3090 singled out as the one consumer GeForce card capable of memory pooling? It just seems weird. What exactly is memory pooling, then, that the 3090 is capable of? I'm clearly confused.

edit: I did find this from Puget that says

> For example, a system with 2x GeForce RTX 3090 GPUs would have 48GB of total VRAM

So it's possible to pool memory with a pair of 3090s. But I'm not sure how it's done in practice.

0

DingWrong t1_iyq0nr0 wrote

Big models get sharded and the chunks get loaded onto each GPU. There are a lot of frameworks ready for this, since the big NLP models can't fit on a single GPU. Alpa even shards the model across different machines.
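
For a rough idea of what that looks like, here's a minimal sketch using Hugging Face Transformers/Accelerate with `device_map="auto"` (just one of those frameworks; the model name is illustrative, and it assumes two CUDA GPUs are visible):

```python
# Minimal sketch: shard one large model's layers across the available GPUs.
# Assumes transformers + accelerate are installed and two CUDA GPUs are visible.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-j-6B"  # illustrative large model, any big causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # splits layers across cuda:0 and cuda:1
    torch_dtype="auto",
)

print(model.hf_device_map)  # shows which layers landed on which GPU

inputs = tokenizer("Model sharding works by", return_tensors="pt").to("cuda:0")
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0]))
```

The point is that each card only holds part of the weights, so the combined VRAM covers a model no single card could fit. That's not the same thing as two cards presenting one unified memory pool.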

3

computing_professor t1_iyqaku8 wrote

Thanks. So it really isn't the same as how the Quadro cards share VRAM. That's really confusing.

1

Dexamph t1_izd1dy7 wrote

This is deadass wrong as that Puget statement was in the context of system memory, nothing to do with pooling:

> How much RAM does machine learning and AI need?

> The first rule of thumb is to have at least double the amount of CPU memory as there is total GPU memory in the system. For example, a system with 2x GeForce RTX 3090 GPUs would have 48GB of total VRAM – so the system should be configured with 128GB (96GB would be double, but 128GB is usually the closest configurable amount).
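
If it helps, that quoted rule of thumb is just arithmetic. A quick sketch (the list of "configurable amounts" is my own assumption about typical RAM configurations, not something Puget specifies):

```python
# Sketch of the quoted rule of thumb: system RAM >= 2x total VRAM,
# rounded up to the nearest commonly configurable capacity.
def recommended_system_ram_gb(num_gpus: int, vram_per_gpu_gb: int) -> int:
    configurable = [32, 64, 128, 256, 512, 1024]  # assumed typical totals, in GB
    needed = 2 * num_gpus * vram_per_gpu_gb
    return next(size for size in configurable if size >= needed)

# 2x RTX 3090 (24GB each) -> 48GB total VRAM -> doubled = 96GB -> rounds up to 128GB
print(recommended_system_ram_gb(2, 24))
```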

1