Submitted by xyrlor t3_ymoqah in deeplearning

With the announcement of the new AMD GPUs, I've gotten curious whether they're an option for deep learning. They offer a lot for gaming, but I'm not sure how well they do for deep learning. For example, PyTorch offers ROCm 5.2 builds for AMD, but how is the performance? Would I be better off looking for a higher-tier Nvidia 3000 series card than the new AMD GPUs?

23

Comments


fjodpod t1_iv52eke wrote

Is it possible?

Yes, but you probably need a newer Linux distro and some basic Linux knowledge.

Do I do it myself?

Yes, in PyTorch with a 6600. It was a bit annoying to set up, with some errors, but now it just works (haven't benchmarked it yet; a quick smoke test is sketched at the end of this comment).

Do I recommend it for the average user?

No, you should only do it if you suddenly want to do machine learning but are stuck with an AMD card.

If you haven't bought a GPU yet and you're considering machine learning, avoid the setup hassle and just pay a bit more for an Nvidia GPU. The 3060 12GB is a good-value graphics card for machine learning.
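
A minimal smoke test for the kind of setup described above, assuming a ROCm build of PyTorch; the `HSA_OVERRIDE_GFX_VERSION` override for consumer RDNA2 cards like the 6600 is a commonly used community workaround, not an official setting, so treat it as an assumption to verify against the ROCm docs for your card:

```python
import os

# The RX 6600 (gfx1032) is not on ROCm's official support list; this override is a
# common community workaround, not an official setting - verify it for your card.
os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")

import torch

# ROCm builds of PyTorch reuse the torch.cuda API, so these calls also work on AMD GPUs.
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    # Small matmul to confirm kernels actually compile and launch on the card.
    x = torch.randn(1024, 1024, device="cuda")
    print("Matmul result lives on:", (x @ x).device)
```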

14

Hiant t1_iv5adoh wrote

not for mortals

6

xyrlor OP t1_iv5axmk wrote

Thanks! I'm currently running a 3070, but it has some errors unrelated to deep learning, so I'm looking around for options while I send the card in for repairs. Since new GPUs have been announced by both Nvidia and AMD, I was curious about the perspective on both gaming and deep learning for side projects.

3

GoodNeighborhood1017 t1_iv60m0z wrote

Check out TensorFlow DirectML and PyTorch DirectML. They work pretty well on AMD GPUs.
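
For the PyTorch route, a minimal sketch of what using the torch-directml package looks like, assuming it is installed alongside a supported PyTorch version (`pip install torch-directml` on Windows or WSL):

```python
import torch
import torch_directml

# DirectML exposes the GPU as its own device object rather than as "cuda".
dml = torch_directml.device()

# Move a small model and a batch onto the DirectML device and run a forward pass.
model = torch.nn.Linear(128, 10).to(dml)
batch = torch.randn(32, 128).to(dml)
print(model(batch).shape)  # torch.Size([32, 10])
```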

3

kmanchel t1_iv7dix3 wrote

ROCm is a much less mature deep learning stack than what Nvidia has (by at least five years). However, your choice depends on your scope of usage and whether you're willing to trade usability for cost (I'm assuming AMD hardware is significantly cheaper).

3

Hamster729 t1_iv7swx8 wrote

Absolutely. In fact, you typically get more DL performance per dollar with AMD GPUs than with NVIDIA.

However, there are caveats:

  1. The primary target scenario for ROCm is Linux + a docker container + gfx9 server SKUs (Radeon Instinct MIxxx). The further you move from this optimal target, the more uncertain things become. You can install the whole thing directly into your Ubuntu system, or, if you really want to waste a lot of time, compile everything from source, but it is best to install just the kernel-mode driver and then do "docker run --privileged" to pull a complete container image with every package already in place. I am not sure what the situation is with Windows support. Support for consumer-grade GPUs usually comes with some delay, e.g. Navi 21 support was only "officially" added last winter. The new chips announced last week may not be officially supported for months after they hit the shelves.
  2. You occasionally run into third-party packages that expect CUDA and only CUDA (the sketch below shows how to tell which backend a given PyTorch build is using). I just had to go through the process of hacking pytorch3d (the visualization package from FB) because it had issues with ROCm.
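
A minimal sketch of distinguishing the backends, which helps when patching packages that hard-code CUDA assumptions; it relies on `torch.version.hip` being set on ROCm builds:

```python
import torch

# ROCm builds of PyTorch reuse the CUDA API surface: devices are still addressed
# as "cuda" and torch.cuda.is_available() returns True on AMD GPUs. The version
# attributes are the usual way to tell the backends apart.
if torch.version.hip is not None:
    print("ROCm/HIP build:", torch.version.hip)
elif torch.version.cuda is not None:
    print("CUDA build:", torch.version.cuda)
else:
    print("CPU-only build of PyTorch")
```
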
1

fjodpod t1_ivk6pbg wrote

Personally I would either wait for the 4000 series mid-tier cards or just buy a 3000 series card with enough VRAM. However, keep in mind that the 4000 series could theoretically be worse than or on par with the 3000 series for machine learning in some cases, due to lower memory throughput: (https://www.reddit.com/r/MachineLearning/comments/xjt129/comment/ipb6p8y/?utm_source=share&utm_medium=web2x&context=3)
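
A rough way to sanity-check the memory-throughput point on whatever card you end up with: a minimal on-device copy benchmark, assuming a CUDA or ROCm build of PyTorch (the numbers are indicative only, not a rigorous benchmark):

```python
import time
import torch

def effective_bandwidth_gbs(size_mb=1024, iters=20):
    """Time repeated on-device copies and report rough effective bandwidth in GB/s."""
    n = size_mb * 1024 * 1024 // 4                 # number of float32 elements
    src = torch.empty(n, dtype=torch.float32, device="cuda")
    dst = torch.empty_like(src)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        dst.copy_(src)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    # Each copy reads size_mb and writes size_mb, hence the factor of 2.
    return 2 * size_mb * iters / elapsed / 1024

if __name__ == "__main__":
    print(f"~{effective_bandwidth_gbs():.0f} GB/s effective bandwidth")
```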

1

xyrlor OP t1_ivlicka wrote

That's what I'm currently considering too. But I'm not optimistic about mid-tier card prices, considering how the 4080 and 4090 are priced.

1