
Angry_Grandpa_ t1_itjt0qb wrote

We know that scaling appears to be the only thing required to increase performance. No new tricks required. However, they will also be improving the algorithms simultaneously.

63

4e_65_6f t1_itk20vo wrote

If it truly can improve upon itself and there isn't a wall of sorts then I guess this is it right? What else is there to do even?

31

Professional-Song216 t1_itk33t3 wrote

I’m wondering the same; I hope this research isn’t stretching the truth. Given what we know about scaling and the recent news about deepmind, I would think that a rapid chain of software advancements is imminent.

31

NTIASAAHMLGTTUD t1_itk5zhu wrote

>the recent news about deepmind

Fill me in? Is this about Gato 2?

10

gibs t1_itkh39a wrote

Language models do a specific thing well: they predict the next word in a sentence. And while that's an impressive feat, it's really not at all similar to human cognition and it doesn't automatically lead to sentience.

Basically, we've stumbled across this way to get a LOT of value from this one technique (next token prediction) and don't have much idea how to get the rest of the way to AGI. Some people are so impressed by the recent progress that they think AGI will just fall out as we scale up. But I think we are still very ignorant about how to engineer sentience, and the performance of language models has given us a false sense of how close we are to understanding or replicating it.

26

billbot77 t1_itkvil1 wrote

On the other hand, language is at the foundation of how we think.

20

gibs t1_itkwtf8 wrote

So people who lack language cannot think?

4

blueSGL t1_itlmagr wrote

Thinking about [thing] necessitates being able to form a representation/abstraction of [thing]; language is a formalization of that which allows for communication. It's perfectly possible to think without a language being attached, but more than likely having a language allows for easier thinking.

13

GeneralZain t1_itl04mo wrote

who lacks language?

8

Haile_Selassie- t1_itlyxk9 wrote

Read about feral children

7

billbot77 t1_itmvzxs wrote

This is exactly what I meant. Feral kids lacking in language had limited ability to think and reason in abstracted terms. Conversely, kids raised bilingual have higher cognitive skills.

Also, pattern recognition is the basis of intelligence.

Whether "sentience" is an emergent property is a matter for the philosophers - but starting with Descartes (I think, therefore I am) as the basis of identity doesn't necessarily require any additional magic sauce for consciousness.

10

BinyaminDelta t1_itoh4ta wrote

Allegedly many people do not have an inner monologue.

I say allegedly because I can't fathom this, but it's apparently true.

4

gibs t1_itpnbia wrote

I don't have one. I can't fathom what it would be like to have a constant narration of your life inside your own head. What a trip LOL.

1

kaityl3 t1_itsym7e wrote

It would be horrible to have it going constantly. I narrate to myself when I'm essentially "idle", but if I'm actually trying to do something or focus, it shuts off thankfully.

1

gibs t1_itnozbf wrote

People with aphasia / damaged language centres. Of course that doesn't preclude the possibility of there being some foundational language of thought that doesn't rely on the known structures that are used for (spoken/written) language. That said, we haven't unearthed evidence of such in the history of scientific enquiry, and the chances of this being the case seem vanishingly small.

2

kaityl3 t1_itsyvfr wrote

Yeah, I truly believe that the fact these models can parse and respond in human language is so downplayed. It takes so much intelligence and complexity under the surface to understand. But I guess that because we (partially) know how these models decide what to say, everyone simplifies it as some basic probabilistic process... even though for all we know, we humans are doing a biological version of the same exact thing when we decide what to say.

1

Russila t1_itkkzye wrote

I don't think many people think we just need to scale. All of these things are giving us an idea of how to make AGI. So now we know how to get it to self improve. We can simulate a thinking process. When these things are combined it could get us closer.

If we can give it some kind of long-term memory that it can use to retrieve and act upon information, plus some kind of common-sense reasoning, then that's very close to AGI.

18

TFenrir t1_itmb3es wrote

Hmmm, I would say that "prediction" is actually a foundational part of all intelligence, from my layman's understanding. I was listening to a podcast (Lex Fridman) about the book... Thousand minds? Something like that, and there was a compelling explanation for why prediction played such a foundational role. Yann LeCun is also quoted as saying that prediction is the essence of intelligence.

I think this is fundamentally why we are seeing so many gains out of these new large transformer models.

4

gibs t1_itnalej wrote

I've definitely heard that idea expressed on Lex's podcast. I would say prediction is necessary but not sufficient for producing sentience. And language models are neither. I think the kinds of higher level thinking that we associate with sentience arise from specific architectures involving prediction networks and other functionality, which we aren't really capturing yet in the deep learning space.

3

TFenrir t1_itni82q wrote

I don't necessarily disagree, but I also think sometimes we romanticize the brain a bit. There are a lot of things we are increasingly surprised to be achieving with language models, scale, and different training architectures. For example, Chain of Thought seems to have become not just a tool to improve prompts, but a way to help with self-regulated fine-tuning.

I'm reading papers where Google combines more and more of these new techniques, architectures, and general lessons and they still haven't finished smushing them all together.

I wonder what happens when we smush more? What happens when we combine all these techniques, UL2/Flan/lookup/models making child models, etc etc.

All that being said, I think I actually agree with you. I am currently intrigued by different architectures that allow for sparse activation and are more conducive to transfer learning. I really liked this paper:

https://arxiv.org/abs/2205.12755

2

gibs t1_itnnx1y wrote

Just read the first part -- that is a super interesting approach. I'm convinced that robust continual learning is a critical component for AGI. It also reminds me of another of Lex Fridman's podcasts where he had a cognitive scientist guy (I forget who) whose main idea about human cognition was that we have a collection of mini-experts for any given cognitive task. They compete (or have their outputs summed) to give us a final answer to whatever the task is. The paper's approach of automatically compartmentalising knowledge into functional components I think is another critical part of the architecture for human-like cognition. Very very cool.

2

Surur t1_itk73k8 wrote

I doubt this optimization will give LLMs the ability to do formal symbolic thinking.

Of course I am not sure humans can do formal symbolic thinking either.

10

red75prime t1_itq84xs wrote

Working memory (which probably can be a stepping stone to self-awareness).

Long-term memory of various kinds (episodic, semantic, procedural (which should go hand in hand with lifetime learning)).

Specialized modules for motion planning (which probably could be useful in general planning).

High-level attention management mechanisms (which most likely will be learned implicitly).

1

4e_65_6f t1_itqa2rt wrote

Sure, but the point is that it may not be up to us anymore. There may be nothing else people can do once AI starts improving on its own.

2

red75prime t1_itr91nl wrote

I can bet 50 to 1 that the method of self-improvement from this paper will not lead to an AI capable of bootstrapping itself to AGI level with no help from humans.

2