Submitted by Nerveregenerator t3_z0msvy in deeplearning

Ok, so I'm considering upgrading my deep learning PC. I'm currently using a 1080 Ti. From my perspective it's still a relatively solid card, and it can be picked up on eBay for 200 bucks. So my question is: would I be better off with four 1080 Tis or one 3090? They should be reasonably similar in price. I'm also aware I'll need a CPU that can handle this, so if you have suggestions for a motherboard and CPU that can keep four 1080 Tis fed with tensors, that would be helpful too. I can't seem to find a straight answer on why this setup isn't more popular, because the cost/performance ratio of the 1080 Ti seems great.

Thanks


EDIT

- So it sounds like a 3090 is the best move to avoid the complexities associated with multiple GPUs. What would you guys think of a pip package that let you benchmark your setup for deep learning and then compare results with other users? Would that be something you'd be interested in?
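(For a sense of what the core of such a package might be: a hedged sketch of a throughput benchmark, assuming PyTorch/torchvision with ResNet-50 as an arbitrary reference model and synthetic data. This is illustrative only, not an existing package.)

```python
# Hypothetical sketch only -- no such pip package exists yet. Times a few training
# steps of a stock ResNet-50 on synthetic data and reports images/second, so the
# number could be compared between, say, a 1080 Ti and a 3090.
import time
import torch
import torchvision

def benchmark(batch_size=64, steps=50):
    model = torchvision.models.resnet50().cuda()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    x = torch.randn(batch_size, 3, 224, 224, device="cuda")   # synthetic images
    y = torch.randint(0, 1000, (batch_size,), device="cuda")  # synthetic labels
    for _ in range(5):  # warm-up iterations, not timed
        torch.nn.functional.cross_entropy(model(x), y).backward()
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    torch.cuda.synchronize()
    return batch_size * steps / (time.time() - start)

print(f"{torch.cuda.get_device_name(0)}: {benchmark():.1f} img/s")
```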

1

Comments


Star-Bandit t1_ix6l9wf wrote

You might also check out some old server hardware. I have a Dell R720 running two Tesla K80s, each of which is essentially the equivalent of two 1080s. While it may not be the latest and greatest, the server ran me $300 and the two cards ran me $160 on eBay.

3

scraper01 t1_ix6t386 wrote

Four 1080 Tis will get you the performance of a single 3090 if you're not using mixed precision. Once tensor cores are enabled, the difference is night and day: in both training and inference, a single 3090 will blow your multi-GPU rig out of the water. On top of that, you'll need a motherboard plus a CPU with lots of PCIe lanes, and those ain't cheap. Pro-grade stuff with enough lanes will run north of $10k. Not worth it.

11

Nerveregenerator OP t1_ix75olf wrote

So I did some research. According to the Lambda Labs website, four 1080 Tis combined will get me about 1.5x the throughput of a 3090 with FP32 training, and FP16 seems to yield about a 1.5x speedup for the 3090. So even with mixed precision it comes out roughly the same. The actual configuration of four cards is not something I'm very familiar with, but I wanted to point this out because it seems like NVIDIA has really bullshitted a lot with their marketing. A lot of the numbers they throw around just don't translate to ML.
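(For anyone reproducing those numbers: the 3090's tensor cores only come into play when mixed precision is actually enabled, e.g. via the standard torch.cuda.amp pattern sketched below. This is a generic illustration, not Lambda's benchmark code; the model and loss are placeholders.)

```python
# Generic torch.cuda.amp pattern (placeholder model/loss). autocast runs eligible ops
# in FP16, which is what actually engages tensor cores on Volta/Turing/Ampere cards;
# Pascal cards like the 1080 Ti see little benefit from it.
import torch

model = torch.nn.Linear(1024, 1024).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()

for _ in range(100):
    x = torch.randn(64, 1024, device="cuda")
    opt.zero_grad()
    with torch.cuda.amp.autocast():
        loss = model(x).square().mean()
    scaler.scale(loss).backward()   # loss scaling avoids FP16 gradient underflow
    scaler.step(opt)
    scaler.update()
```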

2

Star-Bandit t1_ix7anbd wrote

No, each K80 is roughly equal to two 1080 Tis: if you look at the card, it has two GPU chips with about 12 GB of RAM each, for 24 GB of total VRAM per card. The issue is they get hot; running a training model they can sit around 70°C. But it's nice to be able to assign each chip to a different task.

0

Star-Bandit t1_ix7avv6 wrote

Actually, after going back over the numbers for the two cards (bandwidth, clock speed, etc.), the 1080 Ti might well have the upper hand. I'd have to run some benchmarks myself.

1

incrediblediy t1_ix7cqrg wrote

> 4 1080Ti's or 1 3090
>
> ebay for 200 bucks

You can also get a used 3090 for about the same price as 4 × $200, and you can use the full 24 GB of VRAM for training larger models.

5

incrediblediy t1_ix7czdr wrote

> four 1080 Tis combined will get me about 1.5x the throughput of a 3090 with FP32 training, and FP16 seems to yield about a 1.5x speedup for the 3090

I think that's only comparing CUDA cores, without the tensor cores. And either way, you can't merge the VRAM of multiple cards to fit larger models.

3

RichardBJ1 t1_ix7ficv wrote

I think even if you only get similar performance from one card versus four, the single card is going to be far less complex to set up. Just the logistics of a four-GPU build sound like a nightmare.

2

chatterbox272 t1_ix7mx5j wrote

>the cost/performance ratio of the 1080 Ti seems great.

Only if your time is worthless, your ongoing running costs can be ignored, and expected lifespan is unimportant.

Multi-GPU instantly adds a significant amount of complexity that needs to be managed. It's not easy to just "hack it out" and have it work across multiple GPUs: you either need to use frameworks that provide support (and make sure nothing you want to do breaks that support), or you need to write it yourself. That's time and effort you wouldn't have to spend with a single GPU. You'll also be limited with larger models, because splitting a model across GPUs (model parallelism) is far more complicated than splitting batches (data parallelism), so anything that needs more than 11 GB for a single sample is going to be impractical.
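(For illustration, the "easy" path here is plain data parallelism, roughly as in the sketch below; a generic PyTorch example, not anyone's actual training code. The full model is replicated on every GPU, which is why per-card VRAM remains the hard limit.)

```python
# Minimal data-parallel sketch (generic, not anyone's actual code). The whole model
# is replicated on every visible GPU and each GPU gets a slice of the batch, so a
# model that doesn't fit in one card's 11 GB doesn't fit at all.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(2048, 2048), torch.nn.ReLU(), torch.nn.Linear(2048, 10)
).cuda()
model = torch.nn.DataParallel(model)        # replicate across all visible GPUs
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(256, 2048, device="cuda")   # this batch is split across the GPUs
y = torch.randint(0, 10, (256,), device="cuda")
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
```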

You'll also have reduced throughput unless you have a server platform, since even HEDT platforms are unlikely to give you four PCIe Gen3 x16 slots; you'll be on x8 slots at best, and most likely x4. You're going to be pinned to much higher-end parts here, spending more on the motherboard/CPU than you would need to for a single 3090.

It's also inefficient as all buggery. The 3090 has a TDP of 350 W; the 1080 Ti is 250 W each, so four of them are around 1000 W, roughly 3x the power for the same compute (TDP is a reasonable but imperfect stand-in for true power draw). That drastically increases the running cost of the system, means a more expensive power supply, and possibly even a wall-socket upgrade (four 1080 Tis to me means a 1500 W PSU minimum, which would require a special 15 A socket in Australia where I live).

You're also buying cards that are at minimum three years old. They've seen some amount of use, and in an era when GPU mining was a big deal, so many of the cards out there were pushed hard. The longer a GPU has been out of your possession, the less you can rely on how well it was kept. The older architecture will also be dropped from support sooner: Kepler was discontinued last year, so Maxwell is next and then Pascal (where the 10 series lies). Probably a while away, but a good bit sooner than Ampere (which has to wait through Maxwell, Pascal, Volta, and Turing before it hits the chopping block).

TL;DR:
- Pros: Possibly slightly cheaper upfront.
- Cons: Requires more expensive hardware to run, higher running costs, shorter expected lifespan, added multi-GPU complexity, and may not actually be compatible with your wall power.

TL;DR of the TL;DR: Bad idea, don't do it.

7

Dexamph t1_ix7onhf wrote

I think you've way overestimated K80 performance, when my 4 GB GTX 960 back in the day could trade blows with a K40, which is a bit more than half a K80. In a straight memory-bandwidth fight, like Transformer model training, the 1080 Ti is going to win hands down even if you had perfect scaling across both GPUs on the K80, and that's assuming it doesn't get hamstrung by the ancient Kepler architecture in some other way.

2

AmazingKitten t1_ix7w3vr wrote

The 3090 is better. Single-GPU training is easier and it will consume less power. Plus, you can still add another one later.

3

Nerveregenerator OP t1_ix92czu wrote

Ok, thanks, I think that clears up the drawbacks. I'd have to check which motherboard I'm using now, but generally would you expect a 3090 to be compatible with a motherboard that works with a 1080 Ti? Thanks

1

Star-Bandit t1_ix9toom wrote

Interesting, I'll have to look into the specs of the M40. Have you had any issues with running out of VRAM? All my models seem to gobble it up, though I've done almost no optimization since I've only recently gotten into ML.

2

incrediblediy t1_ix9xbce wrote

If your CPU/motherboard supports a PCIe 4.0 x16 slot, that's all you need for an RTX 3090. I have a 5600X with a cheap B550M DS3H motherboard running an RTX 3090 + RTX 3060. I also got a used RTX 3090 from eBay after the decline of mining. Just make sure your PSU can handle it; it draws about 370 W at most.
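(Once the card is installed, a quick way to confirm what PyTorch actually sees; just a generic snippet, not specific to any board.)

```python
# Quick check of what PyTorch sees once the card is installed.
import torch

for i in range(torch.cuda.device_count()):
    p = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {p.name}, {p.total_memory / 1024**3:.1f} GB VRAM")
```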

2

C0demunkee t1_ixd9fdq wrote

Yeah, you can easily use it all up through image size and batch size alone. Some models are also heavy enough that they don't leave much VRAM for the actual generation.

Try "pruned" models, they are smaller.

Since the training sets are all 512x512 images, it makes the most sense to generate at that resolution and then upscale.
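(Assuming the context here is Stable Diffusion-style image generation, the usual VRAM savers are half precision and attention slicing, e.g. with the diffusers library as sketched below; the model ID and prompt are placeholders.)

```python
# Assumes Stable Diffusion via the diffusers library; model ID and prompt are just
# placeholders. FP16 weights roughly halve VRAM use, attention slicing trims it
# further, and 512x512 matches the resolution the model was trained at.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.enable_attention_slicing()

image = pipe("an astronaut riding a horse", height=512, width=512).images[0]
image.save("out.png")
```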

1