Thanks for the response. Do you recall where you read the "only 200 people" bit? I'll take a look around for it as well; the conversation around it sounds like it would be interesting.
P2P is not so much of a limitation as long as you can fit the entire model / pipeline into a single card's VRAM though, correct?
So, for example, if you have a 7B-parameter model at FP16 and it's around 14 GB, presumably you should be safe with 24 GB of VRAM?
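For context, here's the back-of-the-envelope math I'm doing (just a rough sketch of my own, weights only; the helper name is made up, and it ignores activations, KV cache, and framework overhead):

```python
# Rough, weights-only VRAM estimate (illustrative sketch, not a real library call).
# Ignores activations, KV cache, and framework overhead, so treat it as a lower bound.
def weight_vram_gb(n_params: float, bytes_per_param: int = 2) -> float:
    # FP16 = 2 bytes per parameter, FP32 = 4, int8 = 1
    return n_params * bytes_per_param / 1024**3

print(weight_vram_gb(7e9))     # ~13.0 GiB for a 7B model at FP16
print(weight_vram_gb(7e9, 4))  # ~26.1 GiB at FP32 -- would not fit in 24 GB
```

So at FP16 the weights alone come out to roughly 13 GiB, which is where the ~14 GB figure comes from; the remainder of the 24 GB has to cover activations and the KV cache as the context grows.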
ChristmasInOct OP t1_jb2enle wrote
Reply to comment by Appropriate_Ant_4629 in LLaMA model parallelization and server configuration by ChristmasInOct
I really appreciate this response.
I'm not planning on using any of our data or touching the infrastructure yet, but for some reason I never considered using the cloud to determine hardware configuration.
Thanks again. Exactly what I needed!