Comments


Neurogence t1_j7iq0bk wrote

I think it's wise. Everyone is focusing on LLMs. It's not good to put all your eggs in one basket.

15

SoylentRox t1_j7j08r9 wrote

I don't think he'll succeed, but for a very lame reason.

He's likely right that the answer won't be found in transformers alone. However, the obvious way to find the right answer involves absurd scale:

(1) Thousands of people build a large benchmark of test environments (many resembling games) and a library of primitives, by reading every paper on AI and implementing the ideas as composable primitives.

(2) Billions of dollars of compute are spent running millions of AGI candidates - at different levels of integration - against the test bench from (1).

This effort would consider millions of possibilities - in a year or two, more possibilities for AGI than all work done by humans so far. And it would be recursive - these searches aren't blind; they are carried out by the best-scoring AGI candidates, which are tasked with finding an even better one.
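
To make the shape of that search concrete, here's a toy sketch in Python. Everything in it is invented for illustration - the primitives, the "environments", and the scoring are trivial stand-ins, not anyone's actual benchmark or candidate format:

```python
import random

# Toy stand-ins: a "primitive" is a tiny function, a "candidate" is a composition
# of primitives, and each "environment" scores a candidate on one input/target pair.
PRIMITIVES = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x // 2]

def run_candidate(candidate, x):
    for f in candidate:
        x = f(x)
    return x

def make_test_bench(n_envs=20):
    pairs = [(i, 2 * i + 1) for i in range(n_envs)]
    return [lambda c, inp=inp, tgt=tgt: -abs(run_candidate(c, inp) - tgt)
            for inp, tgt in pairs]

def evaluate(candidate, test_bench):
    return sum(env(candidate) for env in test_bench) / len(test_bench)

def mutate(candidate):
    new = list(candidate)
    new[random.randrange(len(new))] = random.choice(PRIMITIVES)
    return new

def search(test_bench, pop_size=50, generations=30, survivors=10):
    population = [[random.choice(PRIMITIVES) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: evaluate(c, test_bench), reverse=True)
        best = population[:survivors]
        # the best-scoring candidates propose the next wave (the "recursive" part)
        population = best + [mutate(random.choice(best)) for _ in range(pop_size - survivors)]
    return max(population, key=lambda c: evaluate(c, test_bench))

bench = make_test_bench()
winner = search(bench)
print("best average score:", evaluate(winner, bench))
```

The real version would swap the toy primitives and environments for the library and benchmark described above, and replace blind mutation with candidates proposing better candidates - that's where the scale and the money go.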


So the reason he won't succeed is he doesn't have $100 billion to spend.

20

Imaginary_Ad307 t1_j7j52le wrote

Pentti O. Haikonen, PhD, also has a different approach and has shown interesting results with a very cheap architecture. However, his solution requires hardware neural networks; according to him, it doesn't work with software neural networks.

1

yeaman1111 t1_j7jj8mq wrote

Thanks, great read. Carmack's an interesting guy; he's humble, but also not shy about his qualities and what he brings to the game. It's good we have people exploring alternative pathways to AI that are not in the same billion-dollar tech-giant wheelhouse. Seems almost cyberpunk.

8

beambot t1_j7jq4qa wrote

Anyone have a list of the 40(ish) ML papers he was recommended...?

7

No_Ninja3309_NoNoYes t1_j7kblef wrote

10K LoC? Sure, if someone writes hundreds of supporting toolkits for that first. My friend Fred says that the pseudocode for better LLMs is just a few lines (a toy sketch follows the list):

  1. Use AI to generate candidate rules like P(People eat sandwiches) >> P(Sandwiches eat people)
  2. Hire lots of humans. Get them to process the data from step 1 and produce rules like P(Sandwiches eat people) = 0.
  3. Feed the rules back to the AI from step 1.
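
Here's a toy Python sketch of those three steps, just to show the loop structure - the rule format, the "AI", and the human-review step are all stand-ins made up for illustration:

```python
import random

def generate_candidate_rules(n):
    # step 1: the "AI" proposes candidate rules with guessed probabilities
    things = ["people", "sandwiches", "dogs", "rocks"]
    return [(f"{random.choice(things)} eat {random.choice(things)}", round(random.random(), 2))
            for _ in range(n)]

def human_review(rules):
    # step 2: humans correct each probability (stub: non-eaters get P = 0)
    eaters = {"people", "dogs"}
    return [(stmt, p if stmt.split(" eat ")[0] in eaters else 0.0) for stmt, p in rules]

def feed_back(rule_store, labelled_rules):
    # step 3: feed the corrected rules back to the AI from step 1
    rule_store.extend(labelled_rules)
    return rule_store

rule_store = []   # stand-in for the model's accumulated knowledge
for _ in range(3):
    rule_store = feed_back(rule_store, human_review(generate_candidate_rules(5)))
print(len(rule_store), "rules, e.g.", rule_store[0])
```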

So let's say that you need one cent for each rule, for a total of a billion rules. With a thousand workers each producing 100K rules a year... It's doable for a billionaire. And you need seven similar schemes for other types of data. However, I think AGI is not feasible within a decade. The hardware, software, data, and algorithms are not ready yet.
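
Running Fred's numbers (just the figures from the comment above, nothing else):

```python
rules_needed   = 1_000_000_000   # one billion rules
cost_per_rule  = 0.01            # one cent each
workers        = 1_000
rules_per_year = 100_000         # per worker

total_cost = rules_needed * cost_per_rule                  # $10 million
years = rules_needed / (workers * rules_per_year)          # 10 years
print(f"${total_cost:,.0f} over {years:.0f} years per scheme")
```

Roughly $10 million and ten years per scheme at those rates, which is why "doable for a billionaire" checks out.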

1

dasnihil t1_j7kvu8z wrote

we just need a team of a few math wizards to come up with better algorithms for training, matrix multiplication, and whatever NP problems there are in meta-learning... oh wait! we can just throw all our data into current AI and it will come up with the algorithms!!

this is how AGI will be achieved; there is no other way, because humanity doesn't get many Emmy Noethers to come up with new ways to do math. humans are busy with their short lives and various indulgences.

3

dasnihil t1_j7kvz2l wrote

and we need this to cut that cost from $100bn to potato, because biology runs on potato hardware, not a $100bn supercomputer. if only these pseudonerds in the AI industry realized it, we'd be expediting our search for more optimally converging networks.

1

megadonkeyx t1_j7l3goa wrote

LLMs seem to be a top-down "reverse pipeline" method: forming intelligence from the interconnections of people's intelligence through language.

it seems that jc is advocating more of a classic bottom-up approach, i.e. create an artificial insect brain, then a mouse-type brain, and build up from small modules.

the thing that stands out here is that it all seems to be done with classical computer hardware and not some radical new hardware.

1

SoylentRox t1_j7lb053 wrote

Pretty much. It's also that those math wizards may be smarter than current AI, but they often duplicate work. And it's an iterative process - AI starts with what we know and tries some things very rapidly. A few hours later it has the results and tries some more things based on that, and so on.

Those math wizards need to publish and then read what others have published. Even with rapid publishing like DeepMind does to a blog - they do this because academic publication takes too long - it's a few months between cycles.

2