
NotARedditUser3 t1_jcsc9lp wrote

You can get LLaMA running on consumer-grade hardware. There are 4-bit and 8-bit quantized versions floating around here, I believe, that fit in a normal GPU's VRAM.
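Rough back-of-the-envelope math on why quantization helps (a sketch, assuming a ~7B-parameter model and counting only the weights, not activations or KV cache):

```python
def model_vram_gb(n_params, bits_per_weight):
    """Approximate VRAM needed just to hold the weights, in GiB."""
    return n_params * bits_per_weight / 8 / 1024**3

llama_7b = 7e9  # roughly 7 billion parameters
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{model_vram_gb(llama_7b, bits):.1f} GiB")
# 16-bit: ~13.0 GiB  (needs a 16GB+ card)
# 8-bit:  ~6.5 GiB   (fits an 8GB card)
# 4-bit:  ~3.3 GiB   (fits most consumer GPUs)
```

So 4-bit quantization brings the 7B model from "workstation card only" down to pretty much any recent consumer GPU.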
