gmork_13 t1_jbbj49n wrote

With fp16 or int8 you can probably fit a couple of smaller LLMs onto that card.
Have a look around. As a very rough rule, fp32 weights take about 4GB of VRAM per 1B params. Halve that for fp16 (about 2GB per 1B) and halve it again for int8 (about 1GB per 1B).
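
If it helps, here's that rule of thumb as a quick Python sketch. It only counts the weights (4, 2 and 1 bytes per param for fp32, fp16 and int8), so treat the numbers as a lower bound: activations, KV cache and framework overhead come on top. The function names and the 8GB figure in the usage note are made up for illustration, not taken from any library or from the card being discussed.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate VRAM (in GB, 1 GB = 1e9 bytes) for the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def max_params_billion(vram_gb: float, dtype: str) -> float:
    """Rough upper bound on model size (billions of params) for a VRAM budget.

    Ignores activations, KV cache and runtime overhead, so real limits are lower.
    """
    return vram_gb / BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    for dtype in ("fp32", "fp16", "int8"):
        gb = weight_memory_gb(1e9, dtype)
        print(f"1B params in {dtype}: ~{gb:.0f} GB")
```

For a hypothetical 8GB budget, `max_params_billion(8, "int8")` gives 8.0, i.e. roughly 8B params of weights before any overhead.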
