
itsnotmeyou t1_jb2z8z7 wrote

Are you using these as part of a system? For just experimenting around, EC2 is a good option, but you would either need to install the right drivers or use the latest Deep Learning AMI. Another option is a custom Docker setup on SageMaker. I like that setup for inference as it's super easy to deploy and separates the model from the inference code. Though it's costlier, and the endpoint is only reachable through the SageMaker runtime.
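For reference, the custom-container flow with the SageMaker Python SDK looks roughly like this. This is a sketch, not from the thread: the image URI, model artifact path, role, instance type, and request format are all placeholders that depend on your own container.

```python
# Sketch: deploy a custom inference container to SageMaker, then call it
# through the SageMaker runtime (as noted above, endpoints are only
# reachable via that runtime API). All names/URIs are illustrative.
import json


def build_request(prompt: str) -> bytes:
    """Serialize an inference request body; the schema depends on your container."""
    return json.dumps({"inputs": prompt}).encode("utf-8")


def deploy(image_uri: str, model_data: str, role: str):
    # Imported inside the function so the sketch stays importable without AWS deps.
    from sagemaker.model import Model

    model = Model(image_uri=image_uri, model_data=model_data, role=role)
    # Instance type is a placeholder; pick one with enough GPU memory for your LLM.
    return model.deploy(initial_instance_count=1, instance_type="ml.g5.xlarge")


def invoke(endpoint_name: str, prompt: str) -> str:
    import boto3

    client = boto3.client("sagemaker-runtime")
    resp = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_request(prompt),
    )
    return resp["Body"].read().decode("utf-8")
```

The separation the comment describes falls out of this: the model artifact (`model_data`) and the serving image (`image_uri`) are versioned and swapped independently.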

A third option would be over-engineering the whole thing by setting up your own cluster service.

In general, if you want to deploy multiple LLMs quickly, go for SageMaker.

0

itsnotmeyou t1_jb2zfbq wrote

On a side note, SageMaker was not supporting shm-size, so it might not work for large LMs.
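For context, shm-size controls the container's `/dev/shm`, which multi-GPU inference stacks (e.g. NCCL-based tensor parallelism) lean on heavily. On plain EC2 you control the Docker flags yourself, so it's easy to enlarge; a Compose fragment would look something like this (service name, image, and size are illustrative):

```yaml
services:
  llm-server:
    image: my-llm-image:latest   # placeholder image
    shm_size: '8gb'              # enlarge /dev/shm; default 64 MB is too small for large LMs
```

The equivalent for plain `docker run` is the `--shm-size` flag, which is the knob the managed SageMaker containers didn't expose.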

2