Submitted by Character-Ad9862 t3_z5y7nr in deeplearning

Hey,

I'm currently looking for a computer to be used mainly for deep learning computer vision tasks (2D image data). It's for a company I'm starting to work for. I'm the first computer vision engineer there, and the plan is to add more employees over time. The computer will be used to train deep learning architectures very often and will occasionally also have to process 3D point cloud data. The budget is around $10-12k.

Questions:

- What's better for my purpose: buying a workstation or just using a service like AWS?

- I've looked at the RTX A6000 graphics card with 48 GB of GDDR6, which seems to be a good fit for my budget considering its price of around $5k. Is there a new generation of NVIDIA graphics cards in sight that would be worth waiting for?

- As for the CPU, it doesn't need to be a high-end product, but it shouldn't be so weak that it becomes a bottleneck either. Any suggestions here?

Thanks in advance!

11

Comments


sweeetscience t1_iy0hh50 wrote

Get a workstation. We used GCP/Vertex to do batch prediction on a computer vision model, but for larger videos it inexplicably fails. Google has now spent 6 weeks trying to figure out why it doesn’t work (everyone, including Google's engineers, agrees that the model container is not the problem). They still don’t have an answer.

We ended up investing in building our own multi-GPU server and not only are our prediction times better, but we can instantly see and diagnose issues that arise.

One of the often overlooked aspects of using public clouds is that there are several layers of abstraction that remove you from what’s happening under the hood. If something happens behind the scenes that you can’t readily diagnose and fix yourself, you’re basically at the mercy of AWS et al to provide you with an answer.

For 10-12k, you can get a handful of high end consumer cards and a boatload of memory, and you have full control of the system.

8

Character-Ad9862 OP t1_iy0n2uw wrote

Really appreciate your insights. Having that extra dependency layer is something that has also worried me.

3

VinnyVeritas t1_ixzu8ny wrote

That $10,000 won't last long on AWS. There's also Lambda Labs; their cloud prices are a lot more affordable. They also build dedicated servers for machine learning.

7

b0untyk1ll3r t1_iy029yi wrote

A bare instance with 128 GB of RAM, 16 CPUs, and 2 GPUs, using reserved instances, would be $800/month, and that's without any bells and whistles (like an EBS volume or data transfer), so I think you would last less than a year.

Given that hardware should last you a couple of years (hopefully), it's a better way to spend your money than EC2.
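The runway arithmetic behind that estimate can be sketched as follows. This is only a rough illustration: the $800/month figure and the $10-12k budget come from this thread, while `EXTRAS_MONTHLY` is a hypothetical placeholder for storage and data transfer, not a real AWS price.

```python
def months_of_runway(budget_usd: float, monthly_cost_usd: float) -> float:
    """Return how many months a fixed budget covers at a flat monthly rate."""
    if monthly_cost_usd <= 0:
        raise ValueError("monthly cost must be positive")
    return budget_usd / monthly_cost_usd


BASE_MONTHLY = 800.0    # bare reserved instance, figure quoted in this thread
EXTRAS_MONTHLY = 200.0  # ASSUMPTION: placeholder for EBS volume, data transfer, etc.

for budget in (10_000, 12_000):
    bare = months_of_runway(budget, BASE_MONTHLY)
    with_extras = months_of_runway(budget, BASE_MONTHLY + EXTRAS_MONTHLY)
    print(f"${budget}: {bare:.1f} months bare, {with_extras:.1f} months with extras")
```

Swap in your own quote for the instance and the extras; the point is only that a fixed budget divided by a recurring bill gives a hard end date, which a purchased machine does not have.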

4

chaplin2 t1_ixzr04d wrote

Setting up a machine with the right GPU, drivers, CPU, and cooling can be a PITA, and it takes time!

3

CKtalon t1_iy1mmlw wrote

The A6000 is almost 2 years old. The newer version, the RTX 6000 (yes, it's a confusing naming convention), is coming out in about 3 months' time, although it might not be easy to get your hands on one.

3

Character-Ad9862 OP t1_iy3ccby wrote

Yeah, I'm a bit worried the graphics card could be a little out of date as well. However, the RTX 6000 will most likely have a significantly higher price tag, which might be too much considering my budget. Is there any alternative card out there that could meet my requirements?

1

BoiElroy t1_ixzz4z3 wrote

Honestly, look up Paperspace Gradient and consider their monthly service. They have a tier where you can quite routinely get decent free GPUs, which is perfect when you're just working up code, refactoring, and making sure a training run is actually going to run. Then, when you're ready to let something run overnight, you select an A6000 or whatever, and it's reasonably priced.

2

Rishh3112 t1_iy3dnqd wrote

Hey, I work for a start-up, and our AI models are trained on an AWS cloud system. I suggest getting an AWS server, since the company will require a hosting service later on, and it's easier to keep and manage a cloud server. AWS security is pretty good, and hosting APIs is a lot easier with a cloud server. Even training models is a lot quicker, since building a system on par with AWS would be quite expensive: not just the system cost, but the power consumption of the system would also be high. When comparing the cost of an AWS system and a physical system in the office, the power consumption offsets the monthly cost of a cloud system. Also, a cloud system can be accessed from any machine anywhere in the world, whereas for a physical system in the office you would need to set up a VPN for access, and the system has to stay powered on so you can use it whenever you want. An AWS server charges only while you are using it, plus a minimal monthly charge. In comparison, a physical system is expensive up front, and the cost of electricity, VPN, and the systems and room for cooling will cost you more than an AWS server in the long run. Hope you found this helpful.
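Whether electricity really tips the balance depends on the numbers, so it may help to put the comparison into a small break-even calculation. This is a minimal sketch under stated assumptions: the wattage, hours of use, and electricity price are hypothetical placeholders, the $800/month cloud figure is the one quoted elsewhere in this thread, and VPN, cooling, and hosting costs are left out.

```python
def monthly_power_cost(watts: float, hours_per_month: float, usd_per_kwh: float) -> float:
    """Electricity cost of running a machine at a constant power draw."""
    return watts / 1000.0 * hours_per_month * usd_per_kwh


def break_even_months(hardware_usd: float, cloud_monthly_usd: float,
                      power_monthly_usd: float) -> float:
    """Months after which owning the hardware becomes cheaper than renting.

    Solves hardware + power * m = cloud * m for m.
    """
    saving_per_month = cloud_monthly_usd - power_monthly_usd
    if saving_per_month <= 0:
        return float("inf")  # the cloud is never more expensive per month
    return hardware_usd / saving_per_month


# ASSUMPTIONS: 1000 W draw, running 24 h/day for 30 days, $0.30 per kWh.
power = monthly_power_cost(watts=1000, hours_per_month=24 * 30, usd_per_kwh=0.30)
months = break_even_months(hardware_usd=12_000, cloud_monthly_usd=800,
                           power_monthly_usd=power)
print(f"power: ${power:.0f}/month, break-even after {months:.1f} months")
```

With different inputs (a lightly used cloud instance, cheap or expensive electricity) the result can flip either way, which is why it is worth running with your own figures.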

1