f_max

f_max t1_jbze2pl wrote

Speaking as someone working on scaling beyond GPT-3 sizes: if there were proof that human-level AI exists at 100T parameters, people would put the money down today to do it. It's roughly $10M to train a 100B-parameter model. With cost scaling roughly linearly with parameter count, that's about $10B to train this hypothetical 100T-parameter AI: 1000x the parameters, 1000x the cost. That's the cost of buying a large tech startup, but a human-level AI is probably worth more than all of big tech combined. The main thing stopping people is that no one knows whether the scaling curves will bend and improvement will plateau with scale, so no one has the guts to put the money down.
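
A minimal back-of-envelope sketch of that extrapolation, anchored on the $10M / 100B figure from the comment and assuming cost scales linearly with parameter count (the linear-scaling assumption is the comment's rough approximation, not a precise model):

```python
# Back-of-envelope: extrapolate training cost vs. parameter count,
# assuming cost scales roughly linearly with parameters.

ANCHOR_PARAMS = 100e9  # 100B-parameter model
ANCHOR_COST = 10e6     # ~$10M to train it (figure from the comment)

def training_cost(params: float) -> float:
    """Estimated training cost in dollars under linear scaling."""
    return ANCHOR_COST * (params / ANCHOR_PARAMS)

print(f"100T params: ~${training_cost(100e12):,.0f}")  # ~$10,000,000,000
```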

5

f_max t1_j3eagrm wrote

Right. So if you'd rather not try to join a big company, there's still work that can be done in academia with, say, a single A100. You might be a bit constrained in pushing the bleeding edge of capability, but there's much to do to characterize LLMs. They're black boxes we understand less than perhaps any previous machine learning model.

Edit: there are also open-source weights for GPT-3-type models with similar performance, e.g. Hugging Face's BLOOM or Meta's OPT.
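
Pulling one of the smaller open checkpoints with the Hugging Face transformers library looks roughly like this (a minimal sketch; the model choice and generation settings are illustrative, and the GPT-3-scale checkpoints need far more GPU memory than this one):

```python
# Minimal sketch: load an open-source OPT checkpoint and sample from it.
# Assumes `pip install transformers torch`. "facebook/opt-1.3b" fits on a
# single GPU; swap in e.g. "bigscience/bloom-560m" for a BLOOM variant.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Large language models are", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```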

3

f_max t1_j3e2s3m wrote

I work at one of the big techs doing research on this. Frankly, LLMs will be the leading edge of the field for the next two years, imo. Join one of the big techs and get access to tens of thousands of dollars of compute per week to train LLMs. Or, in academia, there's lots of work to be done characterizing inference-time capabilities, understanding bias and failure modes, running smaller-scale architecture experiments, etc.
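
A toy sketch of what single-GPU characterization work can look like: probing a small open model with a handful of prompts and flagging where it fails (the prompts, model choice, and pass criterion here are all illustrative assumptions; real evaluations use much larger curated suites):

```python
# Toy probe of inference-time behavior: run a small open model over a few
# prompts and check whether the expected answer appears in the completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # small enough for a single GPU
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

probes = [("2 + 2 =", "4"), ("The capital of France is", "Paris")]
for prompt, expected in probes:
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=8)
    completion = tokenizer.decode(out[0], skip_special_tokens=True)
    verdict = "ok" if expected in completion else "FAIL"
    print(f"[{verdict}] {completion!r}")
```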

14