
pier4r t1_iyzl5ta wrote

I'm not too deep into ML, but I read articles every now and then (especially about hyped models, GPT and co). I see that there is progress on some amazing things (like GPT-3.5), partly because the NNs get bigger and bigger.

My question is: are there studies that check whether NNs could do more (be more precise, or whatever the metric is) given the same number of parameters? In other words, is it a race to make NNs as large as possible (provided they are structured appropriately), or is the "utility" per parameter also growing? I would like to know if there is literature about it.

It is a bit like an optimization question: "do more with the same HW", so to speak.
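To make the "utility per parameter" idea concrete, here is a toy sketch with entirely made-up parameter counts and scores (not real benchmarks), just to show the kind of ratio I have in mind:

    # Toy illustration of "utility per parameter" (made-up numbers, not real benchmarks).
    # The question: does score-per-parameter improve over time, or only the total score?
    models = {
        # name: (parameter count, hypothetical benchmark score)
        "older_model": (1.5e9, 45.0),
        "newer_model": (175e9, 70.0),
    }

    for name, (params, score) in models.items():
        # "utility per parameter": score normalized by parameter count (in billions)
        print(f"{name}: {score / (params / 1e9):.2f} score points per billion parameters")

In this made-up example the bigger model has the higher absolute score but a much lower score per parameter; I'm asking whether there is literature tracking that second quantity.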

2