
djmaxm t1_jd05tgt wrote

I have a 4090 with 32GB of system RAM, but I am unable to run the 30B model because it exhausts the system memory and crashes. Is this expected? Do I need a bunch more RAM? Or am I doing something dumb and running the wrong model? I don't understand how the torrent model, the huggingface model, and the .pt file relate to each other...

3

rikiiyer t1_jddanig wrote

The 30B parameters of the model are being loaded into your GPU's VRAM (24GB on a 4090), which is causing the issue. You can try loading the model in 8-bit, which could reduce its size.
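To make the sizes concrete, here is a minimal sketch: a back-of-envelope estimate of the weight memory a 30B-parameter model needs, plus a hedged example of 8-bit loading via Hugging Face `transformers` with `bitsandbytes` (the `llama-30b` model name is a placeholder, not from the thread, and the snippet assumes both libraries and a CUDA GPU are available):

```python
# Back-of-envelope VRAM math for a 30B-parameter model, plus a sketch
# of 8-bit loading. The model name below is a placeholder; substitute
# the checkpoint you actually have.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Rough estimate of weight memory only (ignores activations/KV cache)."""
    return n_params * bytes_per_param / 1e9

# fp16 weights: 2 bytes/param -> ~60 GB, far beyond a 4090's 24 GB VRAM.
# int8 weights: 1 byte/param  -> ~30 GB, still over 24 GB, so 8-bit alone
# may need CPU offload (device_map="auto") to fit.
fp16_gb = weight_memory_gb(30e9, 2)
int8_gb = weight_memory_gb(30e9, 1)

def load_8bit(model_name: str):
    """Sketch only: requires transformers, bitsandbytes, and a CUDA GPU."""
    from transformers import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",   # offload layers that don't fit onto the GPU to CPU RAM
        load_in_8bit=True,   # quantize weights to int8, ~1 byte per parameter
    )
```

Note that even at int8 the weights alone exceed 24 GB, which is why `device_map="auto"` (spilling some layers to system RAM) is included in the sketch.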

1