
ggf31416 t1_jdesxc0 wrote

With memory offloading and 8-bit quantization you may be able to run the 13B model, but slowly; the 7B model will be faster.
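A rough back-of-the-envelope sketch of why quantization helps: weight memory scales with parameter count times bits per parameter. The helper below (`weight_memory_gb` is a name made up for illustration) counts weights only, in decimal GB, and ignores activations, KV cache, and framework overhead, so real usage will be somewhat higher.

```python
# Approximate memory needed to hold model weights at a given precision.
# Weights only -- activations, KV cache, and runtime overhead are excluded.

def weight_memory_gb(num_params_billion: float, bits_per_param: int) -> float:
    """Approximate decimal GB required for the weights alone."""
    bytes_total = num_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

for size in (7, 13):
    for bits in (16, 8):
        print(f"{size}B @ {bits}-bit: ~{weight_memory_gb(size, bits):.0f} GB")
# 13B at 8-bit needs roughly half the memory of 13B at 16-bit (~13 GB vs ~26 GB),
# which is what makes it borderline runnable with offloading to CPU RAM.
```

Whatever won't fit in VRAM can be offloaded to system RAM (or disk), which is why the 13B model runs at all, and also why it runs slowly: offloaded layers must be transferred on every forward pass.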
