zesterer t1_j95owhm wrote

There's nothing in your example that demonstrates actual reasoning: as I say, GPT-3's training corpus is enormous, larger than a human can reasonably comprehend. Its training process was incredibly good at identifying and extracting patterns within that data set and encoding them into the network.

Although the example you gave is 'novel' in the most basic sense, no one part of it is novel: Bing is no more reasoning about the problem here than a student who searches for lots of similar problems on Stack Overflow and glues solutions together. Sure, the final product of the student's work is "novel", as is the problem statement, but that doesn't mean the student's path to the solution required any intrinsic understanding of the process when such a vast corpus is available to borrow from.

That's the problem here: the corpus. GPT-3 has generalised the training data it has been given extremely well, there's no doubt about that - so much so that it's even able to solve tasks that are 'novel' in the large - but it's still limited by the domains covered by the corpus. If you ask it about new science or try to explain to it new kinds of mathematics, or even just give it non-trivial examples of new programming languages, it fails to generalise to these tasks. I've been trying for a while to get it to understand my own programming language, but it constantly reverts back to knowledge it has from its corpus, because what I'm asking it to do does not appear within its corpus, either explicitly or implicitly as a product of inference.

> ... you actually believe only biological minds are capable of reasoning

Of course not, and this is a strawman. There's nothing inherent about biology that could not be replicated digitally with enough care and attention.

My argument is that GPT-3 specifically is not showing signs of anything that could be construed as higher-level intelligence; that its behaviours, as genuinely impressive as they are, can be explained by the size of the corpus it was trained on; and that, as human users, we are misinterpreting what we're seeing as intelligence when it is in fact just a statistically adept copy-cat machine with the ability to interpolate knowledge from its corpus to cover domains that are only implicitly present in that corpus, such as the 'novel' problem you gave as an example.

I hope that clarifies my position.


zesterer t1_j95bc0m wrote

With respect, the fact that it has found more abstract ways to identify patterns between tokens beyond "these appeared close to one another in the corpus" doesn't imply that it's actually reasoning about what it's saying, nor that it has an understanding of semantics. It's worth remembering that it's had a truly enormous corpus to train on, many orders of magnitude greater than that which human beings are exposed to: it's observed almost every possible form of text, almost every form of prose, and it's observed countless relationships between text segments that have allowed it to form a pretty impressive understanding of how words relate to one another.

Crucially, however, this does not mean that it is meaningfully closer to truly understanding the world than past LLMs or even chat bots more widely. It's really important to take that part of your brain that's really good at recognising when you're talking to a person and put it in a box when talking to these systems: it's not a useful way to intuit what the system is actually doing because, for hundreds of thousands of years, the only training data your brain has had has been other humans. We've learned to treat anything that can string words together in a manner that seems superficially coherent as possessing intrinsic human-like qualities, but now we're faced with a non-human that has this skill and it's broken our ability to think about what they are.

I think a fun example of this is Markov models. Broadly speaking, they're a statistical model built up by scanning through a corpus and deriving probabilities for the chance that certain words follow certain other words. Take 1 word of context and a small corpus, and the output they'll give you is pretty miserable. But jump up to a second- or third-order Markov model (i.e. 2-3 words of context) with a larger corpus, and suddenly they go from incoherent babble to something that seems human-like at a very brief glance. Despite this, the reasoning performed by the model has not changed: all that's happened is that it's gotten substantially better at identifying patterns in the text and using the probabilities derived from the corpus to come up with outputs.
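To make that concrete, here's a minimal sketch of a second-order Markov text model in Python (the function names and the toy corpus are my own, purely for illustration). It does nothing but record which word followed each two-word context in the corpus and then sample from those observations, yet with a big enough corpus the output starts to look superficially fluent:

```python
import random
from collections import defaultdict

def build_model(words, order=2):
    # Map each `order`-word context to every word observed following it.
    model = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        model[context].append(words[i + order])
    return model

def generate(model, seed, length=20):
    # Repeatedly sample a random observed successor of the last `order` words.
    out = list(seed)
    order = len(seed)
    for _ in range(length):
        followers = model.get(tuple(out[-order:]))
        if not followers:  # dead end: this context never occurred in the corpus
            break
        out.append(random.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ran off the mat".split()
model = build_model(corpus, order=2)
print(generate(model, seed=("the", "cat")))
```

There's no understanding anywhere in there, just lookup tables of observed frequencies; the point is that scaling up the context and the corpus makes exactly this kind of mechanism look far smarter than it is.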

GPT-3 is not a Markov model, but it is still just a statistical model: it's got a context of 4,096 tokens, a corpus many orders of magnitude larger than even the most well-read of us are exposed to over our entire lives, and an enormous capacity to identify relationships between these abstract tokens. Is it any wonder that it's extremely good at fooling humans? And yet, again, there is no actual reasoning going on here. It's the Chinese Room argument all over again.
