JohnLawsCarriage t1_j9wcxns wrote

A big NVIDIA card? You'll need at least 8, and even then you won't come close to something like ChatGPT. The computational power required is eye-watering. Check out this open-source GPT-2 bot that runs on a decentralized network of many people's GPUs. I don't know exactly how many GPUs are on the network, but it's more than 8, and look how slow it is. Remember, this is only GPT-2, not a GPT-3-family model like ChatGPT.


ianitic t1_j9wzh8z wrote

That's also just for inference and fine-tuning. Training the model from scratch requires even more processing power.
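
For a rough sense of the gap, here's a back-of-the-envelope sketch in Python that looks only at memory, which is one part of the picture. It assumes a GPT-3-scale model of 175 billion parameters, 80 GB cards, fp16 weights for inference, and the common rule of thumb of about 16 bytes per parameter for mixed-precision training with Adam. The function name and constants are made up for illustration, and activations, KV cache, and framework overhead are ignored, so real numbers would be higher.

```python
import math


def gpus_needed(n_params, bytes_per_param, gpu_mem_gb=80.0):
    """Lower bound on GPU count needed just to hold the model state.

    Ignores activations, KV cache, and framework overhead (assumption).
    """
    total_gb = n_params * bytes_per_param / 1e9
    return math.ceil(total_gb / gpu_mem_gb)


N_PARAMS = 175e9        # GPT-3-scale parameter count
INFERENCE_BYTES = 2     # fp16 weights only
TRAINING_BYTES = 16     # rule of thumb: fp16 weights + fp16 grads
                        # + fp32 master weights + two fp32 Adam moments

inference_gpus = gpus_needed(N_PARAMS, INFERENCE_BYTES)
training_gpus = gpus_needed(N_PARAMS, TRAINING_BYTES)

print(f"inference: at least {inference_gpus} x 80 GB GPUs")
print(f"training:  at least {training_gpus} x 80 GB GPUs")
```

These are floors for fitting the model at all, not for running it quickly. Actual training runs use far more cards than the floor so the job finishes in a reasonable time.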


_Bl4ze t1_j9wkpgg wrote

Yeah, but it would probably be way faster than that if only 8 people were using that network at a time!


JohnLawsCarriage t1_j9xchqo wrote

Oh shit, I just found out how many GPUs they used to train this model: 288 NVIDIA A100 80GB Tensor Core GPUs.