
Quick-Hovercraft-997 t1_jbx9gcj wrote

If latency is not a critical requirement, you can try a serverless GPU cloud like banana.dev or pipeline.ai. These platforms provide easy-to-use templates for deploying LLMs.
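As a rough sketch of what calling such a deployment looks like: most serverless GPU platforms expose your model behind an HTTP endpoint you POST a JSON payload to (the URL, auth key, and field names below are placeholders, not any platform's actual API — check the provider's docs for the real schema).

```python
import json

# Placeholder values; a real platform assigns its own endpoint URL and API key.
ENDPOINT = "https://api.example.com/v1/models/my-llm/infer"
API_KEY = "YOUR_API_KEY"

def build_inference_request(prompt, max_new_tokens=128):
    """Build a JSON request body for a single completion call.

    Field names here are illustrative; each platform defines its own schema.
    """
    return {
        "prompt": prompt,
        "max_new_tokens": max_new_tokens,
    }

payload = build_inference_request("Explain cold starts in one sentence.")
# In a real deployment you would POST this, e.g. with requests:
#   requests.post(ENDPOINT, json=payload,
#                 headers={"Authorization": f"Bearer {API_KEY}"})
print(json.dumps(payload))
```

Note the latency caveat in the comment: serverless GPUs spin down when idle, so the first request after a quiet period pays a cold-start penalty (often tens of seconds for a large model), which is why these platforms suit batch or low-traffic workloads better than latency-sensitive ones.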
