itsnotmeyou t1_jb2z8z7 wrote

Are you using these as part of a system? For just experimenting around, EC2 is a good option, but you'd either need to install the right GPU drivers yourself or use the latest Deep Learning AMI. Another option is a custom Docker setup on SageMaker. I like that setup for inference since it's super easy to deploy and it separates the model from the inference code. It's costlier, though, and the endpoint is only reachable through the SageMaker runtime.
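
If it helps, here's a minimal sketch of that custom-container deploy using the sagemaker Python SDK; the ECR image, S3 model path, role ARN, and instance type below are all placeholders, not real resources:

```python
import sagemaker
from sagemaker.model import Model

# Placeholder values: swap in your own ECR image, model artifact, and IAM role.
model = Model(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/llm-inference:latest",
    model_data="s3://my-bucket/models/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    sagemaker_session=sagemaker.Session(),
)

# Spins up a managed endpoint; your container serves requests behind it.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",  # pick a GPU instance that fits your model
)
```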

The third option is to go full over-engineering and set up your own cluster service.

In general, if you want to deploy multiple LLMs quickly, go for SageMaker.
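
Once an endpoint is up, calling it through the SageMaker runtime is just a boto3 call. The endpoint name and payload shape here are made up for illustration; the actual payload format depends on your container:

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# "my-llm-endpoint" is a placeholder for whatever you named your endpoint.
response = runtime.invoke_endpoint(
    EndpointName="my-llm-endpoint",
    ContentType="application/json",
    Body=json.dumps({"inputs": "Write a haiku about GPUs"}),
)
print(response["Body"].read().decode("utf-8"))
```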