[D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM Submitted by head_robotics t3_1172jrs on February 20, 2023 at 9:33 AM in MachineLearning 51 comments 220
CommunismDoesntWork t1_j9b1qjb wrote on February 20, 2023 at 4:51 PM I'm surprised PyTorch doesn't yet have an option to load model weights partially, on a just-in-time basis. That way even an arbitrarily large model could be inferred on. Permalink 7
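The idea in the comment above can be sketched in a few lines: keep each layer's weights on disk and load only one layer at a time during the forward pass, so peak memory stays roughly one layer regardless of total model size. This is a toy illustration in plain Python (no PyTorch), with hypothetical helper names; libraries such as Hugging Face `accelerate` provide a production version of this kind of disk offloading.

```python
import os
import pickle
import tempfile

def save_layers(layers, directory):
    """Persist each layer's weight matrix to its own file on disk."""
    paths = []
    for i, w in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(w, f)
        paths.append(path)
    return paths

def matvec(w, x):
    """Plain-Python matrix-vector product."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def streamed_forward(paths, x):
    """Run inference holding only one layer's weights in memory at a time."""
    for path in paths:
        with open(path, "rb") as f:
            w = pickle.load(f)   # load just this layer's weights
        x = matvec(w, x)         # apply the layer
        del w                    # free the weights before loading the next
    return x

# Usage: two tiny 2x2 "layers" (identity, then 2x scaling)
layers = [[[1, 0], [0, 1]], [[2, 0], [0, 2]]]
with tempfile.TemporaryDirectory() as d:
    paths = save_layers(layers, d)
    out = streamed_forward(paths, [1.0, 3.0])
print(out)  # identity then 2*I applied to [1.0, 3.0]
```

The trade-off is obvious but worth stating: every forward pass re-reads the weights from storage, so throughput is bounded by disk bandwidth rather than RAM/VRAM, which is why this pattern suits occasional inference of very large models rather than high-throughput serving.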