halixness t1_j9e80y1 wrote on February 21, 2023 at 7:22 AM

Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics

So far I have tried BLOOM over Petals (a distributed inference system); a single prompt took around 30 s on an 8 GB VRAM GPU, which isn't bad!