Submitted by sabeansauce t3_yfjfkh in deeplearning
nutpeabutter t1_iu3v2bd wrote
There is currently no easy way of pooling VRAM. If the model can't fit into VRAM, I suggest you check out https://huggingface.co/transformers/v4.9.2/parallelism.html#tensor-parallelism.
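The core idea behind the tensor parallelism that page describes is to shard a layer's weight matrix across devices so no single GPU has to hold it all. A minimal NumPy sketch (plain arrays standing in for per-GPU shards, not a real multi-GPU implementation):

```python
import numpy as np

# Tensor (column) parallelism sketch: split a linear layer's weight
# matrix column-wise across two hypothetical "devices" and verify the
# sharded computation matches the unsharded one.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # batch of activations
W = rng.standard_normal((8, 6))   # full weight matrix

# Shard columns: each device holds half of the output features.
W0, W1 = np.split(W, 2, axis=1)

# Each device computes its partial output independently...
y0 = x @ W0
y1 = x @ W1

# ...and an all-gather (here just a concatenate) rebuilds the full output.
y_parallel = np.concatenate([y0, y1], axis=1)
y_full = x @ W

print("sharded output matches:", np.allclose(y_parallel, y_full))
```

In a real setup each shard lives on a different GPU and the concatenate is a collective communication op, but the arithmetic is the same.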
sabeansauce OP t1_iu45vee wrote
That is a good intro on the topic; I bookmarked the paper they referenced. Good to know I have this in the toolbox, thank you.