ECEngineeringBE

ECEngineeringBE t1_iwq8f8j wrote

>Static. Deterministic. Unchanging. Such a thing can never be an agent, and thus can never be a true AGI

It can deterministically output probability distributions, which you can then sample, making it nondeterministic. You also say that such a system can never be an agent. A chess engine is an agent. Anything that has a goal and acts in an environment to achieve it is an agent, whether deterministic or not.
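A minimal sketch of the sampling point, assuming a toy softmax over three made-up action scores (the function names and scores are hypothetical, not taken from any real model). The distribution is computed deterministically, and the nondeterminism comes only from the sampling step:

```python
import math
import random


def action_distribution(scores):
    """Deterministic step: the same scores always yield the same probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Nondeterministic step: draw one action index according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


scores = [2.0, 1.0, 0.5]  # hypothetical scores for three actions
probs_a = action_distribution(scores)
probs_b = action_distribution(scores)
assert probs_a == probs_b  # same input, same distribution

rng = random.Random()  # unseeded, so draws differ between runs
draws = [sample_action(probs_a, rng) for _ in range(20)]
print(draws)
```

Seeding the generator (for example `random.Random(0)`) makes the whole pipeline reproducible again, which is one reason the deterministic/nondeterministic distinction says little about agency.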

But even a fully deterministic program can be an AGI. If you deny this, then this turns into a philosophical debate on determinism, which I'd rather avoid.

As for the "static" and "unchanging" points - you can address those with continual learning, although that's not the only way to do it.

There are some other points you make, but those are again simply doing the whole "current models are bad at X, therefore current methods can't achieve X".

I also think that you might be pattern matching a lot to GPT specifically. There are other interesting DL approaches that look nothing like next-token prediction.

Now, I think we ought to narrow down our disagreements here, so as to avoid pointless arguments. So let me ask a concrete question:

Do you believe that a computer program - code running on a computer - can be generally intelligent?

1

ECEngineeringBE t1_iwq1ju9 wrote

At first, I was going to write a comment that went through and addressed every single one of your points. A couple of them are factually wrong, some are confused, but a lot of the other ones boil down to pointing out how current systems are bad at X, therefore deep learning is never going to be able to do X.

This is why I decided to take a somewhat more general approach and not stray too far from the original purpose of my comment. It is not my purpose to convince you that deep learning will achieve AGI, but rather that you can't claim with certainty that it won't.

We have already seen that larger models end up with certain emergent capabilities not present in smaller models, so finding faults in current ones is not sufficient for dismissing the method entirely. Especially because our largest models are still way too tiny in comparison to the human brain - a brain has ~150T synapses (I know that parameters aren't the same as biological synapses, but I'm pointing out the order of magnitude).

Additionally, networks built from matrix multiplications with nonlinear activations are Turing complete, at least in the recurrent case and given unbounded precision or memory. This means that, assuming general intelligence is computable, there exists a set of weights that would create an AGI. The question then becomes not whether you could build an AGI with NNs, but whether backprop, as a program search algorithm, is capable of finding that set of weights. And claiming that you know for certain is the same as claiming that you intuitively understand what a 100T-dimensional search space looks like, and what backprop with regularization is actually doing. Considering the number of papers that keep coming out and pointing out some unexpected behavior of backprop, it is safe to say that nobody fully understands what it's actually doing.
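A toy illustration of the "a set of weights exists" half of that argument. This is only a sketch: it does not demonstrate Turing completeness, and the weights below are hand-picked by me rather than found by backprop. It just shows that a small ReLU network is a program whose behavior is fixed entirely by its weights, here computing XOR:

```python
def relu(value):
    return max(0.0, value)


# Hand-chosen weights (an assumption for illustration, not learned).
W1 = [[1.0, 1.0], [1.0, 1.0]]  # hidden layer: two units, each sums both inputs
B1 = [0.0, -1.0]               # second unit only fires when both inputs are 1
W2 = [1.0, -2.0]               # output: h1 - 2 * h2
B2 = 0.0


def forward(inputs, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a linear output."""
    hidden = [
        relu(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def xor_net(a, b):
    return forward([a, b], W1, B1, W2, B2)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

Whether a search procedure starting from random weights actually lands on a working set like this is a separate question, which is the part of the argument that nobody can currently settle for AGI-scale networks.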

My point, more generally, can be summarized like this:

In any field, if a certain percentage of experts (say 10% or more) hold an opinion X, and you can't prove, either formally or empirically, that X is not true, then you can't claim with complete certainty that X is not true.

Now, some of the confused or factually incorrect statements from your comment:

>For example, having an AI that learns over time is impossible.

Not true, there are various approaches to doing continual learning, such as this one:

https://arxiv.org/abs/2108.06325

>Every model I've witnessed so far has just been static input->output machines

Every system can be expressed as an input->output system - that's what Turing machines are for.

>No amount of cramming data or expanding the models will ever result in an AI that can learn new tasks given some simple instructions and then immediately perform them competently like a human would

I've actually done this. You can do this via prompt engineering. For example, I created a prompt where I add two 8-digit numbers together (written in a particular way) in a stepwise, digit-by-digit fashion, and explain every step to the model in plain language. I then ask it to add two different numbers together, and it begins generating the same explanation of digit-by-digit addition, finally arriving at the correct answer.
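A rough sketch of how such a prompt could be generated programmatically. The wording of the steps and the helper names `explain_addition` and `build_prompt` are my own assumptions for illustration; this is not the exact format I used, and no model is called here:

```python
def explain_addition(a, b):
    """Return a plain-language, digit-by-digit explanation of a + b."""
    xs, ys = str(a), str(b)
    width = max(len(xs), len(ys))
    xs, ys = xs.zfill(width), ys.zfill(width)  # pad so digits line up

    lines = [f"Problem: {a} + {b}"]
    carry = 0
    digits = []
    for pos in range(width - 1, -1, -1):  # rightmost digit first
        dx, dy = int(xs[pos]), int(ys[pos])
        total = dx + dy + carry
        lines.append(
            f"Digit {width - pos} from the right: {dx} + {dy} + carry {carry} "
            f"= {total}; write {total % 10}, carry {total // 10}."
        )
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        lines.append(f"Final carry {carry} becomes the leading digit.")
        digits.append(str(carry))

    lines.append("Answer: " + "".join(reversed(digits)))
    return "\n".join(lines)


def build_prompt(example, query):
    """One worked example, then a new problem left open for the model to continue."""
    worked = explain_addition(*example)
    return f"{worked}\n\nProblem: {query[0]} + {query[1]}\n"


print(build_prompt((12345678, 87654321), (40506070, 12345678)))
```

The prompt ends right after the new problem statement, so a model that has picked up the pattern would be expected to continue with its own step-by-step lines.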

>LLMs no matter their size suffer from the exact same problem and it's clear as soon as you "ask" it something that's outside of the dataset

You do realize that test sets don't contain data from within the dataset, and that the accuracy on them is not zero?
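To make the held-out point concrete, here is a tiny sketch with made-up 1-D data and a 1-nearest-neighbor classifier (both are my assumptions, chosen only because they fit in a few lines). None of the test inputs appear in the training set, yet accuracy is well above zero:

```python
def predict(train, x):
    """Label of the training point closest to x (1-nearest neighbor)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


# Hypothetical data: label is 1 when the feature is above roughly 4.5.
train = [(0.5, 0), (1.5, 0), (2.5, 0), (6.5, 1), (7.5, 1), (8.5, 1)]
test = [(1.2, 0), (3.0, 0), (4.0, 0), (6.0, 1), (9.0, 1)]

# The test inputs are disjoint from the training inputs.
train_inputs = {x for x, _ in train}
assert all(x not in train_inputs for x, _ in test)

correct = sum(predict(train, x) == label for x, label in test)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

This obviously says nothing about how far from the training distribution a model can go; it only shows that "not in the dataset" and "cannot be handled" are different claims.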

1

ECEngineeringBE t1_iwpov33 wrote

Current approach as in autoregressive next token text prediction? Any next token text prediction in general, even multimodal? Or current approach as in entire field of deep learning?

Could you please first specify what you mean by "current approach" and "rework" exactly? In my mind, it doesn't particularly matter if some approach needs a rework if that rework is easily implementable. So I think that you should first kind of expand on the point you're making so that we can discuss it.

1

ECEngineeringBE t1_iw336q1 wrote

I'm glad that we could come to see eye to eye on this one. Though, I personally didn't find the article to be spreading the "AGI is here!" type of hype. They even said that the Turing test is considered outdated in the article. The article did hype me up, but more in a "holy shit let's see what sort of capabilities it'll have" type of way, and these models can be used to help on all sorts of projects, so they have utility.

I personally slightly disagree with those timelines, but since you said that it's your opinion, I don't have any issues with that. Of course, we could go into actually discussing our personal opinions, but that would steer a bit away from the purpose of my original comment, so I think that we can leave it at that. Cheers!

3

ECEngineeringBE t1_iw2wcjj wrote

>because the fundamental architecture for modern machine learning will not get us to AGI

This is what I'm talking about when I say that you're stating your opinions as if they are a fact. You can't reasonably have that level of epistemological certainty about topics like these.

There is a significant number of experts who believe precisely that we don't need a new AI paradigm, and that continuing research in our current direction will lead to AGI. Are they all stupid and delusional? No, they are not. Could they be wrong? Sure, they could. My point is that when you talk about these topics to people who don't know much about them, and you use your authority as an expert without actually separating which parts are opinions and which are facts, they are going to believe that all of it is well-established fact.

>A lot of the smartest people in the field think we're 100 years to never away

Yes, and a lot also don't. Which is my point.

10

ECEngineeringBE t1_iw28jgu wrote

I hate how you use the fact that you're in the field of AI to give an expert opinion on a subject, but aren't honest enough to point out that there is a huge amount of disagreement on timelines and approaches among experts. You make it seem as if your opinion is shared by every single expert working on AI today, even though a huge number of them have 10-40 year timelines.

32

ECEngineeringBE t1_irahdyd wrote

Sure, but just because you can't replicate it doesn't mean that nobody can. We already had Facebook's paper on video generation a week ago, and we also have Stability AI saying that they're planning their own model.

And also, just because the results can't be fully trusted (due to the high barrier to replication) does not mean that the publication isn't "research".

5