amigo213a

amigo213a t1_is9mxyh wrote

I do MLOps on a daily basis and have set up a platform from scratch at my company. The only thing I have to say is that the most popular tools out there are not going to help. Take Kubeflow, for example: you need hands-on experience with Kubernetes to set up good workflows/pipelines, but most of the people using it will be your data scientists/machine learning experts, who usually aren't Kubernetes experts. They rarely build solutions that scale, either. So it comes down to the MLOps platform to meet their weird requirements.

Choosing Kubernetes is a good starting point: it lets you scale out, run workloads in isolation, and much more. You could either set up your own infra in the company or choose one of the managed clusters from AWS/GCP/Azure, depending on their pricing. The only good thing about cloud providers is that you don't need to take care of the infra on your side. For example, if you want to spin up your own Text-to-Image service, you could containerize the solution and push it onto Kubernetes clusters in different regions on AWS or another cloud. On AWS you can also easily get a CDN to scale serving by region.
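
To make the "containerize and push to region-based clusters" idea a bit more concrete, here is a rough sketch in Python that builds one Kubernetes Deployment manifest per region. Everything specific in it is an assumption for illustration, not something from my actual setup: the image name, the region list, the container port, and the helper names `build_deployment` and `build_all` are all made up.

```python
import json

# Hypothetical values for illustration only.
IMAGE = "registry.example.com/text-to-image:0.1.0"
REGIONS = ["us-east-1", "eu-west-1"]


def build_deployment(region, image=IMAGE, replicas=2):
    """Return a Kubernetes apps/v1 Deployment manifest as a plain dict."""
    # The same labels are used for the selector and the pod template,
    # since a Deployment's selector must match its template labels.
    labels = {"app": "text-to-image", "region": region}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"text-to-image-{region}", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "server",
                            "image": image,
                            # Assumed port the model server listens on.
                            "ports": [{"containerPort": 8080}],
                        }
                    ]
                },
            },
        },
    }


def build_all(regions=REGIONS):
    """Build one manifest per region, keyed by region name."""
    return {region: build_deployment(region) for region in regions}


if __name__ == "__main__":
    for region, manifest in build_all().items():
        # kubectl accepts JSON manifests as well as YAML.
        print(json.dumps(manifest, indent=2))
```

Each manifest could then be saved to a file and applied to the matching cluster with something like `kubectl apply -f <file> --context <cluster-context>`, where the context name depends on how your clusters are registered locally.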
