
LesleyFair OP t1_j501xt6 wrote

First, thanks a lot for reading and thank you for the good questions:

A1) The current GPT-3 has 175B parameters. If GPT-4 were 100T parameters, that would be a scale-up of roughly 570x (100T / 175B ≈ 571).

A2) I took the calculation from the paper on the Turing NLG model. The total training time in seconds is obtained by multiplying the number of training tokens by the number of model parameters, then dividing by the number of GPUs times each GPU's sustained FLOPs per second.
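A minimal sketch of that back-of-the-envelope estimate, with all numbers chosen purely for illustration (they are not values from the paper, and the paper's exact formula may include an additional constant factor for FLOPs per parameter per token):

```python
# Back-of-the-envelope training-time estimate as described above.
# All numbers below are illustrative assumptions, not figures from the paper.

tokens = 300e9          # assumed number of training tokens
params = 175e9          # GPT-3-scale parameter count
num_gpus = 1024         # assumed GPU count
flops_per_gpu = 120e12  # assumed sustained FLOPs/s per GPU

# time (s) = (tokens * parameters) / (GPUs * FLOPs per second per GPU)
training_seconds = (tokens * params) / (num_gpus * flops_per_gpu)
training_days = training_seconds / (60 * 60 * 24)

print(f"Estimated training time: {training_days:.1f} days")
```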


adubowski t1_j549298 wrote

1. Is your assumption that GPT-4 will stay the same size as GPT-3?