spreadlove5683 t1_j01vqsq wrote

No, 500x the parameters doesn't mean 500x more powerful, at the very least because GPT-3 was trained using scaling laws that turned out to be suboptimal. It's since been figured out that parameter count wasn't the bottleneck. I forget if the bottleneck was data or compute, but don't expect a much higher parameter count in GPT-4, if it's higher at all. I'm not an expert.
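
For context, here's a rough sketch of the later (Chinchilla-style) rule of thumb, which works out to roughly 20 training tokens per parameter. The GPT-3 figures below are approximate public numbers and the 20x constant is just the commonly cited ballpark, so treat this as an illustration rather than an exact reproduction of any paper's fit:

```python
# Rough illustration of the Chinchilla-style rule of thumb
# (~20 training tokens per parameter). All numbers are approximate.

def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal number of training tokens for a given model size."""
    return n_params * tokens_per_param

gpt3_params = 175e9        # ~175B parameters
gpt3_tokens_seen = 300e9   # ~300B training tokens reported for GPT-3

optimal = chinchilla_optimal_tokens(gpt3_params)
print(f"Chinchilla-style optimal tokens: {optimal:.2e}")        # ~3.5e12
print(f"Tokens GPT-3 actually saw:       {gpt3_tokens_seen:.2e}")
print(f"Under-trained on data by roughly {optimal / gpt3_tokens_seen:.0f}x")
```

By that ballpark, GPT-3 saw far fewer tokens than its parameter count would call for, which is why just multiplying parameters again isn't expected to be the winning move.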
