thundergolfer t1_j8b2g6w wrote
> If you're deploying models to production
Airflow is not a good tool for ML development; leave it back in 2018. Also, Modal can do prod model deployment, model pipelines, and inference.
thundergolfer t1_izol4cw wrote
Reply to comment by gyurisc in [D] Cloud providers for hobby use by gyurisc
Please do! DM me your email and I'll approve your account.
thundergolfer t1_izj8k0i wrote
Reply to comment by gyurisc in [D] Cloud providers for hobby use by gyurisc
No, you won't be able to make requests to the notebook.
thundergolfer t1_izgyu6x wrote
Reply to comment by RSchaeffer in [D] Workflows for quickly iterating over ideas without free access to super computers by [deleted]
They're not actually links; they've just been formatted like they are. They just link to train.py, which is not a website.
thundergolfer t1_izgiaa4 wrote
Reply to [D] Workflows for quickly iterating over ideas without free access to super computers by [deleted]
I'm sorry to shill, but Modal.com is easily the best thing for this. Here's a demo video showing how fast you can edit code, run it in the cloud, and then edit it some more, all in a handful of seconds.
I was the ML Platform lead at Canva and quick iteration was the #1 pain point of our data scientists and MLEs. I left Canva to join Modal because it can do heavy serverless compute and keep your inner dev loop tight.
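To give a concrete feel for that loop, here's a minimal sketch using recent versions of the Modal Python SDK (the app name and function body are just illustrative, not taken from the demo):

```python
import modal

# Assumes `pip install modal` and `modal token new` have been run.
image = modal.Image.debian_slim().pip_install("torch")
app = modal.App("fast-iteration-demo", image=image)

@app.function(gpu="any")  # request whatever GPU is available
def train_step(lr: float) -> float:
    # Edit this body locally and re-run: the new code executes in the
    # cloud within seconds, with no image rebuild or redeploy step.
    import torch  # only needs to exist inside the cloud image
    x = torch.randn(1024, 1024, device="cuda")
    return float((x @ x).sum() * lr)

@app.local_entrypoint()
def main():
    # `modal run this_file.py` ships the code and runs it remotely.
    print(train_step.remote(lr=3e-4))
```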
Again, sorry to shill, but I've been in this sub for like 8 years and think tools like Modal and Metaflow are finally getting us to a place where ML development isn't a painful mess.
thundergolfer t1_iyocw2v wrote
Reply to comment by t0mkaka in [Project] I used whisper to transcribe 2500 episodes from around 80 podcasts and made it searchable. by t0mkaka
Thanks for the details.
> one by Assembly who demoed with parallelized.
What was this demo? Got a link?
thundergolfer t1_iykjcgc wrote
Reply to [Project] I used whisper to transcribe 2500 episodes from around 80 podcasts and made it searchable. by t0mkaka
How do you deploy and scale the transcription? Is it on GPU, which model variant?
I also built a whisper transcription app: modal.com/docs/guide/whisper-transcriber. It can do serverless CPU transcription on-demand. You can check it out and borrow from it if it's useful. The code is open source.
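If you just want the shape of it, a stripped-down version looks roughly like this (hedged: the names, resources, and model size here are illustrative, not the example's actual code):

```python
import modal

image = (
    modal.Image.debian_slim()
    .apt_install("ffmpeg")          # whisper needs ffmpeg for audio decoding
    .pip_install("openai-whisper")
)
app = modal.App("whisper-sketch", image=image)

@app.function(cpu=8, timeout=600)
def transcribe(audio_url: str) -> str:
    import urllib.request
    import whisper

    path, _ = urllib.request.urlretrieve(audio_url, "/tmp/episode.mp3")
    model = whisper.load_model("base")  # small enough for CPU inference
    return model.transcribe(path)["text"]
```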
PS: Yes, this does violate rule 5 (promote on weekends). I violated the same rule when I posted my whisper app :)
thundergolfer t1_iyk3y2p wrote
Reply to comment by Cheap_Meeting in [D] Cloud providers for hobby use by gyurisc
You can't deploy Colab, only link people to the notebook?
thundergolfer t1_iyk3df7 wrote
Reply to [D] Cloud providers for hobby use by gyurisc
Try modal.com.
Modal is an ML-focused serverless cloud, and much more general than replicate.com, which only lets you deploy ML model endpoints. But it's still extremely easy to use.
It's the platform that this openai/whisper podcast transcriber is built on: /r/MachineLearning/comments/ynz4m1/p_transcribe_any_podcast_episode_in_just_1_minute/.
Or here's an example of doing serverless batch inference: modal.com/docs/guide/ex/batch_inference_using_huggingface.
This example from Charles Frye runs Stable Diffusion Dream Studio on Modal: twitter.com/charles_irl/status/1594732453809340416
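For a rough feel of what deploying an endpoint looks like, here's a hedged sketch (the model choice and names are placeholders, not code from those examples):

```python
import modal

image = modal.Image.debian_slim().pip_install("transformers", "torch")
app = modal.App("sentiment-endpoint", image=image)

@app.function()
@modal.web_endpoint(method="POST")
def classify(item: dict) -> dict:
    # Hypothetical model; real code would cache the pipeline across
    # requests instead of rebuilding it on every call.
    from transformers import pipeline

    clf = pipeline("sentiment-analysis")
    return clf(item["text"])[0]
```

`modal deploy` then gives the endpoint a persistent URL you can POST JSON to.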
thundergolfer t1_ixalc2h wrote
Reply to comment by Dense_History_1786 in [D]deploy stable diffusion by Dense_History_1786
If it doesn't suit, lmk what didn't work well. Otherwise, I think other serverless GPU platforms will be your best bet. I don't think GCP does serverless GPUs, and although AWS SageMaker supports it, their UX makes development a big pain.
thundergolfer t1_ix9zysc wrote
Reply to [D]deploy stable diffusion by Dense_History_1786
> How can I deploy it so that its scalable ?
There's no such general thing as "scalability" (AKA magic scaling sauce). You'll have to be a lot more specific about how your deployment is not handling changes in load parameters.
If I had to guess, I'd say the likely scaling issue is going from a single VM with a single GPU to N GPUs able to run inference in parallel.
If that is your main scaling issue, modal.com can do serverless GPU training/inference against N GPUs almost trivially: twitter.com/charles_irl/status/1594732453809340416.
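For a sense of that fan-out, here's a hedged sketch using Modal's `Function.map` (the model id and prompts are placeholders, not the code from that tweet):

```python
import modal

image = modal.Image.debian_slim().pip_install(
    "diffusers", "transformers", "torch", "accelerate"
)
app = modal.App("sd-parallel", image=image)

@app.function(gpu="A10G")
def generate(prompt: str) -> bytes:
    # Real code would cache the pipeline across calls; reloading it
    # per prompt is shown here only to keep the sketch short.
    import io
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    img = pipe(prompt).images[0]
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()

@app.local_entrypoint()
def main():
    prompts = [f"an oil painting of a storm, variation {i}" for i in range(8)]
    # Each prompt runs in its own container/GPU; Modal scales out for you.
    for i, png in enumerate(generate.map(prompts)):
        with open(f"out_{i}.png", "wb") as f:
            f.write(png)
```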
(disclaimer: work for modal)
thundergolfer t1_j8bjgpu wrote
Reply to comment by PHEEEEELLLLLEEEEP in [D] What ML dev tools do you wish you'd discovered earlier? by TikkunCreation
If you don't have issues then definitely don't bother migrating! Something like Metaflow or Modal is much more purpose-built for ML. Airflow was designed for the Hadoop era of data engineering; it's straining under the changes that have happened in the Python, container, and ML ecosystems since.