Submitted by zveroboy152 t3_102n6qp in MachineLearning

Greetings!


In my adventures with PyTorch, and supporting ML workloads in my day-to-day job, I wanted to continue homelabbing and build out a compute node to run ML benchmarks and jobs on.


This brought me to the AMD MI25, and at $100 USD it's surprising how much horsepower and VRAM you can get for the price. Hopefully my write-up will help someone in the machine learning community.


Let me know if you have any questions or need any help with a GPU compute setup. I'd be happy to assist!


https://www.zb-c.tech/2022/11/20/amd-instinct-mi25-machine-learning-setup-on-the-cheap/

36

Comments


gradientpenalty t1_j2u9byl wrote

Do you have any benchmarks to share? Would be very nice if those were available.

3

currentscurrents t1_j2uul25 wrote

Interesting how that card went from $15k to $100 in the space of five years.

I'm holding out hope the A100 will do the same once it's a couple generations old.

3

SnooHesitations8849 t1_j2uw5u7 wrote

If AMD were better at providing drivers, this would be a game changer for beginners. LoL.

1

zveroboy152 OP t1_j2v9h5g wrote

The driver is a bit awkward to deal with, but it isn't terrible. Running GPU workloads inside a Docker container isn't bad either, if you know your way around containerization.
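For reference, a typical ROCm container launch looks something like this (a sketch: the device and group flags follow AMD's ROCm Docker documentation, and the `rocm/pytorch` image tag is just an example — pick one matching your ROCm version):

```shell
# Sketch of running a ROCm PyTorch container on the host's AMD GPU.
# /dev/kfd is the ROCm compute interface, /dev/dri holds the GPU render nodes,
# and "video" is the group that typically owns those device nodes.
docker run -it --rm \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --security-opt seccomp=unconfined \
  rocm/pytorch \
  python3 -c "import torch; print(torch.cuda.is_available())"
```

On ROCm builds of PyTorch, `torch.cuda.is_available()` reports the AMD GPU, so printing `True` is a quick sanity check that the passthrough worked.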

But I do agree, NVIDIA's drivers are easier to deal with (referencing my K80 driver install experience: https://www.zb-c.tech/2022/12/11/how-to-install-drivers-on-ubuntu-for-the-nvidia-tesla-k80/ )

5

5death2moderation t1_j2wrwrl wrote

Tesla M40s, and now P100s, were $200 apiece just four years after release. V100s have not depreciated as quickly, though, presumably because the tensor cores keep their performance competitive. I would assume A100s will suffer the same fate of staying very expensive for many years to come, sadly.

6

CrashTimeV t1_j3zt9sk wrote

Are you the person from Craft Computing's Discord who is running Stable Diffusion on his MI25?

2

CrashTimeV t1_j41qigp wrote

Let me know if you figure out GPU Direct Storage. I just got 2x P100s and an R730 for my ML rig, then later found out my SSDs were not the correct ones, so I'm waiting for the new ones to arrive. Can't wait to integrate this into my lab and workflow.

1