TheButteryNoodle OP t1_iyo7url wrote

Right. Model parallelization was one of my concerns with any type of dual GPU setup as it can be a hassle at times and isn't always suitable for all models/use cases.

As for the A100, the general plan was to purchase a card that still has Nvidia's manufacturer warranty active (though that may be a bit tough at that price point). If there is any type of extended warranty I could purchase, whether from Nvidia or a reputable third party, I would definitely look into it. In general, if the A100 were the route I went, there would be some level of protection purchased, even if it costs a little bit more.

As for the cooling, you're right... that is another pain point to consider. The case I currently have is a Fractal Design Torrent. It has two 180mm fans in the front, three 140mm fans at the bottom, and a 120mm exhaust fan at the back. I would hope that these fans alongside an initial blower-style setup would provide sufficient airflow. If they don't, I would likely move to custom water cooling.

What I'm not sure about, though, is how close the performance of the RTX 6000 Ada comes to an A100. If the difference isn't ridiculous for fp16 and fp32, then it would likely make sense to lean toward the 6000. There's also the 6000's fp8 support, with CUDA 12 right around the corner.
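(For anyone comparing the two cards: the fp16 gains on either GPU come from tensor cores, which frameworks expose via mixed precision. A minimal PyTorch sketch, assuming nothing beyond `torch.autocast` — it uses the CPU/bfloat16 path so it runs anywhere; on an A100 or 6000 Ada you'd use `device_type="cuda"` with `torch.float16`:)

```python
import torch

x = torch.randn(32, 128)
w = torch.randn(128, 128)

# fp32 baseline matmul.
y32 = x @ w

# The same matmul under autocast runs in reduced precision; on tensor-core
# GPUs this is where the fp16-vs-fp32 throughput gap shows up.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y16 = x @ w

print(y32.dtype, y16.dtype)
```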

2

TheButteryNoodle OP t1_iynr0g4 wrote

Hey there! Thanks for the response! I'm a bit of a novice when it comes to how NVLink works, but wouldn't you still need model parallelization to fit a model over 24GB with 2x 3090s connected via NVLink? I thought they would still show up as two separate devices, similar to two 4090s; the benefit being that the NVLink bridge connects the two GPUs directly instead of going over PCIe. Not too knowledgeable about this, so please feel free to correct me if I'm wrong!
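(That's correct: NVLink'd 3090s still enumerate as `cuda:0` and `cuda:1`, so splitting a model across them is explicit. A minimal sketch in PyTorch, assumed framework; it falls back to CPU so it runs without two GPUs:)

```python
import torch
import torch.nn as nn

# On a 2x 3090 box these would be "cuda:0" and "cuda:1"; fall back to CPU
# so the sketch runs anywhere.
two_gpus = torch.cuda.device_count() >= 2
dev0 = torch.device("cuda:0" if two_gpus else "cpu")
dev1 = torch.device("cuda:1" if two_gpus else "cpu")

class SplitModel(nn.Module):
    """Naive model parallelism: half the layers on each device."""
    def __init__(self, hidden=256):
        super().__init__()
        self.part0 = nn.Sequential(nn.Linear(64, hidden), nn.ReLU()).to(dev0)
        self.part1 = nn.Linear(hidden, 10).to(dev1)

    def forward(self, x):
        x = self.part0(x.to(dev0))
        # Activations cross devices here; NVLink makes this copy cheaper
        # than PCIe, but it is still an explicit device-to-device transfer.
        return self.part1(x.to(dev1))

model = SplitModel()
out = model(torch.randn(8, 64))
print(out.shape)  # torch.Size([8, 10])
```

(So NVLink doesn't merge the pools into one 48GB device; it just speeds up the transfers that model parallelism requires.)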

1