
blueSGL t1_jb6h9jc wrote

>Seeing the large variance in the hardware cost/performance of current models, I'd think the margin for progress from software optimization alone is huge.

>I believe we already have the hardware required for one ASI.

Yep, how many computational "ah-ha" moment tricks are we away from running much better models on the same hardware?

Look at Stable Diffusion and how its memory requirement fell through the floor. We're already seeing something similar with LLaMA now that it's in public hands (via links from pull requests on Facebook's GitHub, lol): tricks that allow lower VRAM usage are already being implemented in LLM front ends.
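To make the kind of trick concrete, here's a minimal sketch (not from any particular front end; it assumes the Hugging Face transformers + accelerate + bitsandbytes stack, and the model id is a placeholder) where 8-bit weight loading alone roughly halves VRAM versus fp16:

```python
# pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/llama-7b-hf"  # placeholder repo id, not a real checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # quantize weights to int8 at load time (~half the VRAM of fp16)
    device_map="auto",   # place layers across GPU/CPU so a smaller card can still load it
)
```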

13

Baturinsky t1_jb6v5pm wrote

I haven't noticed any improvement in Stable Diffusion's memory requirements in the last 5 months... My RTX 2060 is still enough for 1024x640, but no more.

LLaMA does well in tests at small model sizes, but that small size could make it less of a fit for RLHF.

There is also miniaturisation for inference by reducing precision to int8 or even int4. But that doesn't work for training, and I believe AGI requires real-time training.
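To show what that int8 miniaturisation actually does, a toy PyTorch sketch (my own illustration, symmetric per-tensor quantization): the rounding error is tolerable for inference but far too coarse for the tiny gradient updates training needs.

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor int8: store int8 values plus one fp32 scale."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4096, 4096)              # fp32 weight matrix
q, scale = quantize_int8(w)              # 4x smaller than fp32, 2x smaller than fp16
max_err = (dequantize(q, scale) - w).abs().max()
print(f"max rounding error: {max_err:.5f}")  # small, but bigger than typical gradient steps
```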

So, in theory, AGI could be achieved even without big "a-ha"s. Take existing training methods, train across many different domains and data architectures, add tree search from AlphaGo and real-time training, and we will probably be close. But it would require pretty big hardware. And it would be "only" superhuman in some specific domains.
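The tree-search ingredient, for anyone unfamiliar, boils down to UCT-style selection over action statistics. A toy sketch with stand-in numbers (obviously nothing AlphaGo-scale, which also adds learned policy/value priors):

```python
import math

def uct_select(stats: dict, c: float = 1.4):
    """Pick the action maximizing mean reward plus an exploration bonus."""
    total = sum(visits for visits, _ in stats.values())

    def score(action):
        visits, total_reward = stats[action]
        if visits == 0:
            return float("inf")  # always try unvisited actions first
        return total_reward / visits + c * math.sqrt(math.log(total) / visits)

    return max(stats, key=score)

# action -> (visit count, total reward); toy numbers
stats = {"a": (10, 7.0), "b": (3, 2.5), "c": (0, 0.0)}
print(uct_select(stats))  # "c": the unvisited action wins the exploration bonus
```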

3