
NotARedditUser3 t1_jcsc9lp wrote

You can get LLaMA running on consumer-grade hardware. There are 4-bit and 8-bit quantized versions floating around here, I believe, that fit in a normal GPU's VRAM.
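Rough back-of-the-envelope math on why quantization helps (a sketch, assuming a ~7B-parameter model and counting only the weights, not activations or KV cache):

```python
def model_vram_gb(n_params, bits_per_weight):
    """Approximate VRAM needed just to hold the weights, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

llama_7b = 7e9  # roughly 7 billion parameters
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{model_vram_gb(llama_7b, bits):.1f} GiB")
# 16-bit: ~13.0 GiB  (needs a 16GB+ card)
# 8-bit:  ~6.5 GiB   (fits an 8GB card)
# 4-bit:  ~3.3 GiB   (fits most consumer GPUs)
```

So 4-bit quantization brings the 7B model from "workstation card only" down to pretty much any recent consumer GPU.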
