Comments

ghostfaceschiller t1_jdfc5uj wrote

Leaving out the best part: a commented out line reveals that the original/alternate title of the paper was “First Contact With An AGI System”

108

currentscurrents t1_jdft0hp wrote

>They seem to refer to this model as text-only, contradicting to the known fact that GPT-4 is multi-modal.

I noticed this in the original paper as well.

This probably means that they implemented multimodality the same way PaLM-E did: starting with a pretrained LLM.

57

Maleficent_Refuse_11 t1_jdgmml7 wrote

I get that people are excited, but nobody with a basic understanding of how transformers work should give room to this. The problem is not just that it is auto-regressive/doesn't have an external knowledge hub. At best it can recreate latent patterns in the training data. There is no element of critique and no element of creativity. There is no theory of mind, there is just a reproduction of what people said, when prompted regarding how other people feel. Still, get the excitement. Am excited, too. But hype hurts the industry.

36

Miserable_Movie_4358 t1_jdgqcy4 wrote

If you follow the line of argument, this person is referring to the model described in the published paper. In addition, I invite you to investigate what knowledge means (P.S. it is not merely having access to data).

5

Jean-Porte t1_jdgs7u3 wrote

Isn't Davinci-3 GPT-3? Is GPT-4 GPT-3 trained much longer?

1

Necessary-Meringue-1 t1_jdgwjo8 wrote

That's true, but the outputs it produces are eerily persuasive. I'm firmly in the "LLMs are impressive but not AGI" camp. Still, the way it used Java to draw a picture in the style of Kandinsky blew me away. Obviously, a text2image model would be able to do that. But here they prompted GPT-4 to generate code that would generate a picture in a specific style. That requires an extra level of abstraction, and I can't really understand how it came about, given that you would not expect a task like this in the training data. (page 12 for reference: https://arxiv.org/pdf/2303.12712.pdf)

I agree that a transformer really should not be considered "intelligent" or AGI, but LLMs really have an uncanny ability to generate output that looks "intelligent". Granted, that's what we built them to do, but still.

32

anothererrta t1_jdgx8pd wrote

There is no point arguing with the "it just predicts next word" crowd. They only look at what an LLM technically does, and while they are of course technically correct, they completely ignore emergent capabilities, speed of progress and societal impact.

The next discussion to have is not whether we have achieved early stages of AGI, but whether it matters. As long as we're not pretending that a system is sentient (which is a separate issue from whether it has AGI properties) it ultimately doesn't matter how it reliably solves a multitude of problems as if it had general intelligence; it only matters that it does.

48

omgpop t1_jdgz4xl wrote

If I understand correctly, the model is optimised to effectively predict the next word. That says nothing of its internal representations or lack thereof. It could well be forming internal representations as an efficient strategy to predict the next word. As Sam Altman pointed out, we’re optimised to reproduce and nothing else, yet look at the complexity of living organisms.

EDIT: Just to add, it’s not quite the same thing, but another way of thinking of “most probable next word” is “word that a person would be most likely to write next” (assuming the training data is based on human writings). One way to get really good at approximating what a human would likely write given certain information would be to actually approximate human cognitive structures internally.
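To make "most probable next word" concrete, here is a minimal sketch of my own, not anything from the paper: a toy bigram counter in Python. The corpus and the function names (`train_bigram`, `next_word_distribution`, `most_probable_next`) are made up for illustration, and a real LLM conditions on far more context than one previous word, so treat this only as the crudest possible version of the objective.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, prev):
    """Turn the follower counts for `prev` into probabilities."""
    followers = counts.get(prev.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # word never seen with a follower
    return {word: n / total for word, n in followers.items()}


def most_probable_next(counts, prev):
    """Pick the single most likely next word, or None if unknown."""
    dist = next_word_distribution(counts, prev)
    return max(dist, key=dist.get) if dist else None


# Hypothetical toy corpus, purely for illustration.
toy_corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(toy_corpus)

print(next_word_distribution(model, "the"))
print(most_probable_next(model, "the"))
```

A lookup table like this clearly has no internal representations worth talking about; the open question in this thread is what a model has to build internally to do the same job well over long contexts.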

28

paulgavrikov t1_jdgzmlj wrote

This is a good reminder for everyone to delete comments from tex files before uploading to arXiv ... especially if they were not meant to be public.

43

Econophysicist1 t1_jdh6fac wrote

Right, emergent properties are the key, and they cannot be predicted from what NLMs are supposed to do or how they work; this is why they are emergent. The only way to find out what properties well-trained NLMs have is to test them experimentally, as this paper did and as other papers are doing, such as this one:
https://arxiv.org/abs/2302.02083#:~:text=Theory%20of%20Mind%20May%20Have%20Spontaneously%20Emerged%20in%20Large%20Language%20Models,-Michal%20Kosinski

15

nerdimite t1_jdh7u44 wrote

Whether this is AGI or not seems irrelevant as long as it can demonstrate the capabilities or simulate intelligent behaviour. Also, what is AGI if not "artificial" intelligence, as opposed to real or true intelligence per se? We are trying to compare human intelligence with AI. But if two things demonstrate similar intelligent properties, regardless of how, both can still be called sorta intelligent. Intelligence itself is a very subjective and philosophical term. At this point in technology, my opinion is that it shouldn't matter what is and what is not AGI, coz there's no way to measure that right now that everyone agrees on, as long as it demonstrates some form of "artificial" intelligence.

1

CptTombstone t1_jdhkpsp wrote

>At best it can recreate latent patterns in the training data.

Have you actually read the paper? On fresh LeetCode tests, GPT-4 significantly outperforms humans at every difficulty level, reaching nearly double the performance of LeetCode users on medium and hard questions. Those are tests that were recently added to LeetCode's database and were not in the training data. Also, it performs genuinely well with image generation through SVG code. The 3D modelling in JavaScript example (Figure 2.7) is way out of the domain of what you would expect from "just a transformer"; it demonstrates real understanding outside of the domain of the training data. It even outperforms purpose-trained image generation models like Stable Diffusion in some regards, namely adherence to instructions, although the generated images are not that visually pleasing compared to the likes of DALL-E or Stable Diffusion, which is a very unfair complaint for a freaking language model.

49

stimulatedecho t1_jdhlv4w wrote

>> nobody with a basic understanding of how transformers work should give room to this

I find this take to be incredibly naive. We know that incredible (and very likely fundamentally unpredictable) complexity can arise from simple computational rules. We have no idea how the gap is bridged from a neuron to the human mind, but here we are.

>> There is no element of critique and no element of creativity. There is no theory of mind, there is just a reproduction of what people said, when prompted regarding how other people feel.

Neither you, nor anybody else has any idea what is going on, and all the statements of certainty leave me shaking my head.

The only thing we know for certain is that the behavioral complexity of these models is starting to increase almost exponentially. We have no idea what the associated internal states may or may not represent.

60

agent_zoso t1_jdhovl9 wrote

Furthermore, if we are to assume that an LLM can be boiled down to nothing more than a statistical word probability engine because that's what its goal is (which is dubious for the same reason we don't think of people with jobs as being only defined as payraise probability engines, what if a client asks a salesman important questions unrelated to the salesman's goal, etc.), this point of view is self-destructive and completely incoherent when you factor in that ChatGPT in particular is also trained using RLHF ("Reinforcement Learning from Human Feedback").

Every time you leave a Like/Dislike (or take the time to write out longer feedback) on one of ChatGPT's messages, that gets used directly by ChatGPT to train the model through a never-ending process of (simulated) evolution through model competition with permutations of itself. So there are two things to note here: A. Its goals include not only maximizing log-likelihoods of word sequences but also inferring new goals from whatever vague feedback you've provided it, and B. How can anyone be so sure that such a system couldn't develop sophisticated complexity like sentience or consciousness, as humans did through evolution (especially when such a system is capable of creating its own goals/heuristics and we aren't sure how many layers of abstraction it's recursively doing so with)?

On that second point in particular, we just don't currently have the philosophical tools to make any sort of statements about that, but people are sticking to hard-and-fast black and white statements of the kind we made about even other humans until recent history. We as humans love to have hard answers about others' opinions so I see the motivation for wanting to tamp down the tendency to infer emotion from ChatGPT's responses, but this camp has gone full swing in the other direction with unscientific and self-inconsistent arguments because they've read a buzzfeed or verge article produced by people with skin in the game (long/short msft, it's in everyone's retirement account too).

I think the best reply in general to someone taking the paperclip-maximizer stance while claiming to know better than everyone else the intricacies of an LLM's latent representations of concepts encoded through the linear algebraic matrix multiplication in the V space, the eigenvector (Q,K) embeddings from PCA or BERT-like systems, or embedded in its separate neuromorphic structure ("it's just autocorrect, bro") is to draw the same analogy that they're just a human meat-puppet designed to maximize dopamine and therefore merely a mechanical automaton slave to biological impulses. Obviously this reductionism is in general a fallacious way of rationalizing things (something we "forget" time and again throughout history because this time it's different), but you also can't counter by outright stating that ChatGPT is sentient/conscious/whatever, we don't know for sure whether that's even possible (cf. Chinese room -against, David Chalmers' Brain of Theseus -for, Penrose's contentious Gödelian construction demonstrating human supremacy as Turing machine halt checkers -against).

8

pmirallesr t1_jdi0eqg wrote

With these people, it's interesting to ask: how do we know human intellect is not emergent behaviour of a simple task? That would correspond to a radical view of predictive coding. I'm no expert in neuroscience, but to me, the idea that AGI cannot arise from a single simple task makes less and less sense as time goes by.

5

man_im_rarted t1_jdi7jyx wrote

I get that people are excited, but nobody with a basic understanding of how evolutionary biology works should give room to this. The problem is not just that it is IGF optimizing/blind hill climbing. At best it can randomly stumble onto useful patterns. There is no element of critique and no element of creativity. There is no theory of mind, there is just a stochastic reproduction maximizing, when prompted they just respond with what maximizes their odds of reproducing. Still, get the excitement. Am excited, too. But hype hurts the industry.

12

ReadSeparate t1_jdi9wic wrote

What if some of the latent patterns in the training data that it's recreating are those that underlie creativity, critique, and theory of mind? Why are people so afraid of the idea that both of these things can be true? It's just re-creating patterns from its training data, and an emergent property from doing that at scale is a form of real intelligence because that's the best way to do it, because intelligence is how those patterns originated from in the first place.

3

Username2upTo20chars t1_jdiesqt wrote

>But here they prompted GPT-4 to generate code that would generate a picture in a specific style.

5 seconds of googling "code which generates random images in the style of the painter Kandinsky":

http://www.cad.zju.edu.cn/home/jhyu/Papers/LeonardoKandinsky.pdf

https://github.com/henrywoody/kandinsky-bot

GPTs trained on all the sensible text on the WWW are just sophisticated echo/recombination chambers. True, it works far better than most would have predicted, but that doesn't change the way they work. I am also impressed, but GPT-3 got known for parroting content, so why should the next generation be fundamentally different? It just gets harder and harder to verify.

Nevertheless I even expect such generative models to be good enough to become very general. Most human work isn't doing novel things either. Just copying up to smart recombination.

17

inglandation t1_jdij4o8 wrote

> why should the next generation be fundamentally different?

Emergent abilities from scale are the reason. There are many examples of that in nature and many fields of study. The patterns of snowflakes cannot easily be explained by the fundamental properties of water. You need enough water molecules in the right conditions to create the patterns of snowflakes. I suspect that a similar phenomenon is happening with LLMs, but we haven't figured out yet what the patterns are and what are the right conditions for them to materialize.

10

inglandation t1_jdijeu5 wrote

> One way to get really good at approximating what a human would likely write given certain information would be to actually approximate human cognitive structures internally.

Yes, I hope that we'll be able to figure out what those structures are, in LLMs and in humans. It could also help us figure out how to align those models better if we can create more precise comparisons.

6

mescalelf t1_jdin2y7 wrote

Thank you for mentioning Microsoft’s (and MA investors’) role in this/their “skin in the game”. I’m glad to hear I’m not the only one who thought the press in question—and resulting popular rhetoric—seemed pretty contrived.

3

ShadoWolf t1_jdipal4 wrote

Some of the capabilities of GPT-4 are spooky. I mean, GPT-4 hired someone off of TaskRabbit to solve a CAPTCHA for it in the test phases (https://cdn.openai.com/papers/gpt-4.pdf). I don't think it's at AGI, but it sort of feels like, wherever we are on the S-curve for this technology, we're finally on the same continent as AGI.

And some of the stuff people are getting it to do using LangChain with ChatGPT is crazy.

2

agent_zoso t1_jdiwkmj wrote

It always is. If you want to get really freaky with it, just look at how NFTs became demonized at the same time as when Gamestop's pivot to NFT third-party provider was leaked by WSJ. Just the other month people were bashing the author of Terminal Shock and hard sci-fi cyberpunk pioneer Neal Stephenson in his AMA for having a NFT project/tech demo by arguing with someone that knows 1000x more than they do, saying it's just a CO2 emitter and only scam artists use it and were disappointed to see he'd try to do this to his followers. Of course, the tech has evolved and those claims weren't true in his case, but it was literally all in one ear out the other for these people even after he'd defend himself with the actual facts about his green implementation and how it works. They bought an overly general narrative and they're sticking to it!

Interesting that now, with a technology that produces an order of magnitude more pollution (you can actually list models on Hugging Face by the metric tonnes of CO2 equivalent released during training) and producing an epidemic of cheaters in high schools, universities, and the work force, it's all radio silence. God only knows how much scamming and propaganda (which is just scamming but "too big to fail") is waiting in the wings.

I don't think the average person even knows what they would do with such a powerful LLM beyond having entertaining convos with it or having it write articles for them. Of course they see other people doing great things with it and not really any of the other ways it's being misused by degens right now, which goes back to an advantage in corporate propaganda.

2

Username2upTo20chars t1_jdj8a6k wrote

I don't disagree with the phenomenon of emergence, it's just that it doesn't explain anything. It is one word for "I have no idea how it works" or better: it's magic. The issue I have with that is that you are quick to hide behind that word, using it as an explanation, accepted as emergence has become.

But in fact you can't model one bit with it, it has no predictive power and it kind of shuts down discussions.

So far I haven't seen any evidence (have you?) that LLMs are doing anything other than predicting the next token. Yes, there are certain thresholds where they overcome one weakness or another. But in the end they just predict the next token better ... and even better. It's impressive what you can do with that (Chinese room like), but that doesn't imply that GPT-4 is any different from GPT-3.5; it's just better.

But as I wrote, you can in theory replace most non-manual work with that somewhere down the line anyway. But no GPT will develop you some ground-breaking Deep Learning architecture or solve important physics problems which need actual thought and not just more compute or...

Not that you claimed that - I do here -, but should GPT-7 or so suddenly do that, then you can hold me to it.

10

Snoo58061 t1_jdjdy56 wrote

I like to call this positive agnosticism. I don't know and I'm positive nobody else does either.

Tho I lean towards the theory of mind camp. General intelligence shouldn't have to read the whole internet to be able to hold a conversation. The book in Searle's Chinese Room is getting bigger.

8

Maleficent_Refuse_11 t1_jdji46u wrote

How do you quantify that? Is it the downdoots? Or is it the degree to which I'm willing to waste my time discussing shallow inputs? Or do you go by gut feeling?

On a serious note: Have you heard of Brandolini's law? The asymmetry described there has been shifted by several orders of magnitude with generative "AI". Unless we are going to start using the same models to argue with people (e.g. chatbot output) on the net, we will have to choose much more carefully which discussions we involve ourselves in, don't you think?

−5

E_Snap t1_jdjug2q wrote

That’s a magical requirement, dude. We as humans have to study for literal years on a nonstop feed of examples of other humans’ behavior in order to be a competent individual. Why are you saying that an AI shouldn’t have to go through that same kind of development? At least for them, it only has to happen once. With humans, every instance of the creature starts out flat out pants-on-head rtrdd.

5

inglandation t1_jdjvmqe wrote

> you can't model one bit with it, it has no predictive power and it kind of shuts down discussions.

For now yes, my statement is not very helpful. But this is a phenomenon that happens in other fields. In physics, waves or snowflakes are an emergent phenomenon, but you can still model them pretty well and make useful predictions about them. Life is another example. We understand life pretty well (yes there are aspects that we don't understand), but it's not clear how we go from organic compounds to living creatures. Put those molecules together in the right amount and in the right conditions for a long time, and they start developing the structures of life. How? We don't know yet, but it doesn't stop us from understanding life and describing it pretty well.

Here we don't really know what we're looking at yet, so it's more difficult. We should figure out what the structures emerging from the training are.

I don't disagree that LLMs "just" predict the next token, but there is an internal structure that will pick the right word that is not trivial. This structure is emergent. My hypothesis here is that understanding this structure will allow us to understand how the AI "thinks". It might also shed some light on how we think, as the human brain probably does something similar (but maybe not very similar). I'm not making any definitive statement, I don't think anyone can. But I don't think we can conclude that the model doesn't understand what it is doing based on the fact that it predicts the next token.

I think that the next decades will be about precisely describing what cognition/intelligence is, and in what conditions exactly it can appear.

7

Snoo58061 t1_jdjvybp wrote

I'm saying it's not the same kind of development and the results are different. A human works for a long time to grasp the letters and words at all, then extracts much more information from many orders of magnitude smaller data sets with weaker specific recall and much faster convergence for a given domain.

To be clear I think AGI is possible and that we've made a ton of progress, but I just don't think that scale is the only missing piece here.

1

laisko t1_jdjwll7 wrote

Inspired by the paper I downloaded a random SVG example file and asked Alpaca/LLaMA to make changes to the code so that it looked more like a human face.

After a couple failed attempts I added some (heavy) restrictions, and it presented me with this (left is original, right is alpaca/llama output): https://i.imgur.com/787tlCU.png. Found it rather amusing to be honest.

My final prompt was:

### Instruction: The SVG code provided below draws a green square with pink borders, an orange disk, a diagonal blue line, and some straight red lines. Your task is to modify the SVG code so that the output looks more like a human face. Don't add new stuff, use short and efficient code (don't use <polygon points/> or <path/> for starters), but be creative and have fun. The code MUST be short (max 112 words) and complete.

(Did it 'have fun'? Who knows!)

2

E_Snap t1_jdjwmkp wrote

Honestly, I have a very hard time believing that. Machine learning has had an almost trailblazing relationship with the neuroscience community for years now, and it’s pretty comical. The number of moments where neuroscientists discover a structure or pattern developed for machine learning years and years ago and then finally admit “Oh yeah… I guess that is how we worked all along,” is too damn high to be mere coincidence.

3

Snoo58061 t1_jdjxmti wrote

The brain almost certainly doesn't use backpropagation. Liquid nets are a bit more like neurons than the current state of the art. Most of this stuff is old theory refined with more compute and data.

These systems are hardly biologically plausible. Not that biological plausibility is a requirement for general intelligence.

3

DragonForg t1_jdkb8w9 wrote

This is fundamentally false. Here is why.

In order to prove something and then prove it incorrect you need distinct guidelines. Take gravity: there are plenty of equations, plenty of experiments, etc. We know what it looks like, what it is mathematically, and so on. So if we take a computational version of gravity, we have a reliable method of comparison. Someone can say this game's gravity doesn't match ours, as we have distinct proofs for why it doesn't.

However, what we are trying to prove/disprove is something we have 0 BASIS FOR. We barely understand the brain, or consciousness, or why things emerge the way they do; we are nowhere near close enough to make strict definitions of theory of mind or creativity. The only comparison available is how closely it mimics ours.

Stating that it doesn't follow my version of theory of mind is ridiculous. It's the same as saying my God is real and yours isn't: your basis for why we have creativity is not a distinct, proven definition but rather an interpretation of your experiences studying/learning it.

Basically, our mind is a black box too; we only know what comes out, not what happens inside. If both machine and human produce the same output from the same input, it legitimately doesn't matter what happens inside, at least until we can PROVE how the brain works to exact definitions. Until then, input and output data are sufficient for a proof; otherwise AI will literally kill us because we keep obsessing over these definitive answers.

It's like saying nukes can't do this or that. Instead of focusing on the fact that nuclear weapon can destroy all of humanity. The power of these tools just like nuclear weapons shouldn't be understated because of semantics.

3

cyborgsnowflake t1_jdkruku wrote

We know the nuts and bolts of what is happening since it's been built from the ground up by humans. GPT-x is essentially a fancy statistical machine. Just rules for shuffling data around to pick word x + 1 on magnetic platters. No infrastructure for anything else. Let alone a brain. Unless you think adding enough if statements creates a soul. I'm baffled why people think GPT is sentient just because it can calculate solutions based on the hyperparameters of the knowledge corpus as well as or better than people. Your Casio calculator or linear regression can calculate solutions better than people. Does that mean your Casio calculator or the x/y grid in your high school notebook is sentient?

0

cyborgsnowflake t1_jdl47n8 wrote

I think the simpler answer is that it's easier than some people believed to reproduce certain knowledge tasks statistically, rather than the alternative theory, which everyone else on this thread seems to be jumping on board with, that shuffling tensors creates living, thinking beings.

2

agent_zoso t1_jdlgre2 wrote

The use of neural nets (ReLU + LayerNorms) layered between each attention step counts as a brain, no? I know the attention mechanism is what gets the most ... attention, but there's still traditional neural nets sandwiched between and in some cases the transformer is just a neck feeding into more traditional modules. ReLU is Turing complete so I can always tune a neural net to have the same response pattern of electrical outputs as any neuron in your brain.

The million dollar question according to David Chalmers is, would you agree that slowly replacing each neuron with a perfect copy one at a time will never cause you to go from completely fine to instantly blacked out? If you answered yes, then it can be shown (sections 3&4) that you must accept that neural nets can be conscious, since by contradiction if there was a gradual phasing out of conscious experience rather than sudden disappearance, that would necessarily require the artificial neurons to at some point begin behaving differently than the original neurons would (we would be aware of the dulling of our sensation).

Considering we lose brain cells all the time and don't get immediately knocked out, I think you can at least agree that most people would find these assumptions reasonable. It would be pretty weird to have such a drastic effect for such a small disturbance.

4

Western-Image7125 t1_jdnvnu7 wrote

Well your last line kinda makes the same point as the other person you are debating with? What if we are getting really close to actual intelligence, even though it is nothing like biological intelligence which is the only kind we know of

3

cyborgsnowflake t1_jdoz0ia wrote

In a very general sense neurons and NNs are the same in that they are both networks, but the brain, from what we know, is structured very differently from GPT, which is more or less simply a linear circuit for processing tensors. I'm not sure what the reasoning is to jump to the conclusion that a living being is popping into existence when you run GPT just because the output 'looks human'. You could just conclude that 'knowledge tasks to a certain degree can be approximated statistically'. As anyone who watched horny men get fooled by chatbots in the 90s should know.

If you believe the former, then logically, if you replaced the computer circuits with humans, then even people writing equations on paper together should, if there were enough of them, theoretically also cause these 'calculation beings' with minds independent of the humans themselves to pop into existence. Which maybe you can argue for under certain philosophies, but that's veering off into territory far from orthodox computer science.

2

agent_zoso t1_jdpbe5o wrote

It sounds like you're pretty hard set on there being no ghost in the shell and pretty judgmental of anyone who thinks otherwise. I'm just saying you're far too certain you have the answers, as my example demonstrates. I also never said I believe a living being is jumping into existence because of whatever successful Turing test. I'm actually agnostic on that and think it's a waste of time trying to answer something that will never be answered. It's always going to come down to bickering over definitions and goalpost-shifting ("It can't be sentient if it doesn't model glial cells/recurrent cortical stacks/neurotransmitter densities/electrochemistry/quantum computational effects inside microtubules/the gut microbiome/the embedding within the environment/the entire rest of the universe like us"). I'd much rather play it safe and treat it as though it is conscious.

Maybe I'm misunderstanding you, but it sounds like you're now also being far too dismissive of the representational power tensors/linear operations and especially eigendecompositions can have (I could probably blow your mind with the examples I've seen), and of statistics as a loss function. After all, we as humans are literally no more than statistical mechanical partition functions of Von Neumann density matrices, what would you even use for a loss function instead? MSE, cross-entropy (perplexity), KL, L1/L2 are statistical and used to build every neural net you've heard about. The only difference between us and say a Turing-complete (nonlinear ReLU + attentional) Kalman filter for text like you're making GPT out to be is how the hyperparameters are adjusted. A Kalman filter uses Bayesian inference with either Laplace's method or maximum-likelihoodist rules, whereas we (and ChatGPT) are genetically rewarded for minimizing both cost (resp. perplexity) and nonlinear human feedback. Keep boiling things down and you'll find you're surrounded by philosophical zombies.
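For anyone who hasn't seen these losses written out, here is a minimal sketch, assuming plain Python and natural logarithms, of cross-entropy, perplexity, and KL divergence. The function names and the example probabilities are hypothetical and chosen only to illustrate; this is not how any particular GPT model or library computes its loss.

```python
import math


def cross_entropy(true_token_probs):
    """Average negative log-probability (in nats) the model gave the true tokens."""
    return -sum(math.log(p) for p in true_token_probs) / len(true_token_probs)


def perplexity(true_token_probs):
    """Exponentiated cross-entropy: the effective number of equally likely choices."""
    return math.exp(cross_entropy(true_token_probs))


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same outcomes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical probabilities a model assigned to the correct next token
# at three positions in a text.
confident = [0.9, 0.8, 0.95]
uniform_guess = [0.25, 0.25, 0.25]  # guessing uniformly among 4 words

print(perplexity(confident))
print(perplexity(uniform_guess))
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

Lower perplexity just means the model put more probability on what was actually written next, which is the purely statistical sense of "better" I mean here.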

Edit: Didn't see the second paragraph you added. I'm not sure what ML orthodoxy you're from, but Chalmers' result is pretty well accepted in CogSci. The setup that you're describing, the Chinese room, is an appeal to common sense, but a lot of what motivates scientific development is trying to understand paradoxes and counter-intuitive results. Sure, it sounds absurd, but so do Schrödinger's cat and black holes, both of which failed to disprove the underlying phenomena. Chalmers' 1995 result came after the Chinese Room thought experiment (by about 15 years in fact) and has since updated the consensus on the Chinese Room by stressing the importance of universality. Since your example has humans performing the computation, I would say it could be alive (depending on the complexity of the equations, are they reproducing the action potentials of a human brain?), and case in point I think the internet even before ChatGPT is the most likely and well-known contender for a mass of human scribbles being conscious.

1