Submitted by singularpanda t3_1060gfk in MachineLearning
suflaj t1_j3e1h8f wrote
Not by a long shot.
ChatGPT in practice is a politically-biased conversational Google and Wikipedia summarizer with a bit of polite talk. And it is less broad than both of them.
It is truly fascinating how DEEP it can go, e.g. translating arbitrary code into almost-correct assembly, even for a recent target like M1, but that's that. It cannot reason fully, it cannot extrapolate, and most importantly, its training data is too old to keep up with the speed of NLP research.
But it's nifty to chat with if none of your colleagues have the time.
Freed4ever t1_j3e36am wrote
But that's the current state, we know there will be a v.next to infinity, no? Would there be a state where it can train itself, similar to how DeepMind's agents train themselves in games?
suflaj t1_j3e3piv wrote
Based on the techniques ChatGPT uses we cannot formally prove that it can generalize without infinite width. Even our training process amounts to mostly teaching the model to compress knowledge. ChatGPT made some strides by partially introducing something similar to reinforcement learning, but reinforcement learning itself is not enough to extrapolate or come up with new concepts.
All the big names in AI claim that stochastic gradient descent techniques and our current direction are fascinating, but ultimately a dead end. Certainly the area has been stale for several years and has degenerated into a dick measuring contest, only instead of dicks you measure parameters, TPUs and metrics on benchmark datasets. Blame transformers which were in a sense us getting a taste of the forbidden fruit, but you know what followed after that.
Of course, out of this you do get some advances useful for the industry, but nothing really of note in the general picture. And it seems to me that lately all these big models that imitate knowledge really well are generating negative sentiment in the population, which may ruin AI.
Freed4ever t1_j3e66lo wrote
Thanks. I'm not a researcher, and I'm more curious about the practical side of the technology. So, the problem is wide, so we cannot formally prove it, which is fair. However, if I'm interested in the practicality of the tech, I do not necessarily need a formal proof, I just need it to be good enough. So, just use code generation as an example, it is conceivable that it generates a piece of code, then it actually executes the code and then learns about its accuracy, performance, etc. And hence it is self-taught. Looking at another example, say poetry generation, it is conceivable that it generates a poem, publishes it and then crowdsources feedback to teach itself as well?
suflaj t1_j3e9yim wrote
Well, my first paragraph covers that.
> So, just use code generation as an example, it is conceivable that it generates a piece of code, then it actually executes the code and then learns about its accuracy, performance, etc. And hence it is self-taught.
It doesn't do that. It learns how to have a conversation. The rest is mostly a result of learning things through learning how to model language. Don't give it too much credit. As said previously, it cannot extrapolate.
Think_Olive_1000 t1_j3toxf1 wrote
I think they meant: it is conceivable that in the future it could, i.e. you hook an LLM up to a REPL. https://youtu.be/pdSfgRYy8Ao take a look at 15 minutes in. I could easily see how you could fine-tune using self-appraisal by executing code.
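A minimal sketch of that idea, assuming the generated code arrives as a plain string and the "self appraisal" is simply the fraction of test cases it passes. The names `score_candidate` and `solve` are hypothetical, nothing here comes from ChatGPT or the linked video, and running generated code with `exec` would need a real sandbox in practice.

```python
from typing import Any, Dict, List, Tuple


def score_candidate(source: str, func_name: str,
                    cases: List[Tuple[tuple, Any]]) -> float:
    """Run generated source and return the fraction of test cases it passes.

    The return value could serve as a reward signal for fine-tuning.
    """
    namespace: Dict[str, Any] = {}
    try:
        exec(source, namespace)  # UNSAFE outside a sandbox; illustration only
    except Exception:
        return 0.0  # code that does not even load earns nothing

    func = namespace.get(func_name)
    if not callable(func):
        return 0.0

    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash on this case counts as a failure
    return passed / len(cases) if cases else 0.0


# Two hypothetical model outputs for "write solve(a, b) that adds two numbers".
good = "def solve(a, b):\n    return a + b\n"
bad = "def solve(a, b):\n    return a - b\n"
cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0), ((2, 2), 4)]

print(score_candidate(good, "solve", cases))
print(score_candidate(bad, "solve", cases))
```

The score only says how many cases passed, so anything about performance or code quality would need its own measurement on top of this.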
suflaj t1_j3tq0u2 wrote
Sure you could. But the cost is so high that it probably outweighs the benefits. And that is even if you made training stable (we already know from recurrent networks, GANs and even transformers that they're not particularly stable). Hooking it up to the REPL would essentially turn the task into reinforcement learning. And if you know something about reinforcement learning, you know that it generally doesn't work, because the environment the agent has to traverse is too difficult for it to learn anything. What DeepMind managed to achieve with their chess and go engines is truly remarkable, but these are THEIR achievements despite the hardships RL introduces. This is not the achievement of RL. Meanwhile ChatGPT is mostly an achievement of a nice dataset, a clever task and deep learning. It is not that impressive from an engineering standpoint (other than syncing up all the hardware to preprocess the data and train it).
Unless LLMs are extremely optimized with regard to latency and cost, or unless compute becomes even cheaper (not likely), they have no practical future for the consumer.
So far, it's still a dick measuring contest, as if a larger model and dataset will make much of a difference. I do not see much interest in making them more usable or accessible, I see only effort in beating last year's paper and getting investors to dump more money into a bigger model for next year. I also see ChatGPT as being a cheap marketing scheme all the while it's being used for some pretty nefarious things, some of them being botted Russian or Ukrainian war propaganda.
So you can forget the REPL idea. Who would it serve? Programmers have shown they are not willing to pay for something like GitHub Copilot. Large companies can always find people to hire and do programming for them. Unless these are strides in something very expensive, like formal verification, it's not something a large company, the kind that has the resources to research LLMs, would go into.
Maybe the next step is training it on WolframAlpha. But at that point you're just catching up to almost 15 year old software. Maybe that "almost 15 year old" shows you how overhyped ChatGPT really is for commercial use.
Think_Olive_1000 t1_j3tqojo wrote
Nah, there's already work that can reduce generic LLM size by half and not lose any performance. And LLMs I think will be great as foundation models for training more niche smaller models for narrower tasks - people already use OpenAI's API to generate data to fine-tune their own niche models. I think we'll look back at current LLMs and realise just how inefficient they were - though a necessary evil to prove that something like this CAN be done.
suflaj t1_j3twskh wrote
Half is not enough. We're thinking in the order of 100x or even more. Do not forget that even ordinary BERT is not really commercially viable as-is.
I mean sure you can use them to get a nicer distribution for your dataset. But at the end of the day the API is too slow to train any "real" model, and you can already probably collect and generate data for smaller models yourself. So as a replacement for lazy people - sure, I think ChatGPT by itself probably has the potential to solve most repetitive questions people have on the internet. But it won't be used like that at scale so ultimately it is not useful.
If it wasn't clear enough by now, I'm not skeptical because of what LLMs are, but because they simply do not scale up to real-world requirements. Ultimately, people do not have datacenters at home, and OpenAI and other vendors do not have the hardware for any actual volume of need other than a niche, hobbyist one. And the investment to develop something like ChatGPT is too big to justify for that use.
All of this was ignoring the obvious legal risks from using ChatGPT generations commercially!
Think_Olive_1000 t1_j3u3k7w wrote
BERT is being used by Google for search under the hood. It's how they've got that instant fancy extractive answers box. I don't disagree that LLMs are large. So was Saturn V.
suflaj t1_j3u4smq wrote
Google's BERT use is not a commercial, consumer product; it is an enterprise one (Google uses it and runs it on their own hardware). They presumably use the large version, or something even larger than the pretrained weights available on the internet, and to achieve the latencies they have, they use datacentres and non-trivial distribution schemes, not just consumer hardware.
Meanwhile, your average CPU will need anywhere from 1-4 seconds to do one inference pass in ONNX Runtime. It is of course much less on a GPU, but to be truly cross-platform you're targeting JS in most cases, which means CPU and a stack that is not as mature as what Python/C++/CUDA have.
What I'm saying is:
- people have said no to paid services, they want free products
- consumer hardware has not scaled nearly as fast as DL
- even ancient models are still too slow to run on consumer hardware after years of improvement
- distilling, quantizing and optimizing them seems to get them to run just fast enough to not be a nuisance, but is often too tedious to work out for a free product
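To make "quantizing" concrete, here is a minimal sketch of symmetric 8-bit quantization on a plain list of weights. The weights are made up and `quantize_int8` and `dequantize` are hypothetical helper names; real toolchains work per tensor or per channel and usually calibrate on data, so treat this as an illustration of the idea rather than what any particular runtime does.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]  # made-up values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight now fits in 1 byte instead of 4 (float32), at the cost of
# a rounding error of about half a quantization step at most.
worst_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, worst_error)
```

Going from 32-bit floats to 8-bit integers is roughly a 4x reduction in size, so on its own it is nowhere near a 100x improvement.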
currentscurrents t1_j3eiw5w wrote
I think you're missing some of the depth of what it's capable of. You can "program" it to do new tasks just by explaining in plain english, or by providing examples. For example many people are using it to generate prompts for image generators:
>I want you to act as a prompt creator for an AI image generator.
>Prompts are descriptions of artistic images than include visual adjectives and art styles or artist names. The image generator can understand complex ideas, so use detailed language and describe emotions or feelings in detail. Use terse words separated by commas, and make short descriptions that are efficient in word use.
>With each image, include detailed descriptions of the art style, using the names of artists known for that style. I may provide a general style with the prompt, which you will expand into detail. For example if I ask for an "abstract style", you would include "style of Picasso, abstract brushstrokes, oil painting, cubism"
>Please create 5 prompts for an mob of grandmas with guns. Use a fantasy digital painting style.
This is a complex and poorly-defined task, and it certainly was not trained on this since the training stops in 2021. But the resulting output is exactly what I wanted:
>An army of grandmas charging towards the viewer, their guns glowing with otherworldly energy. Style of Syd Mead, futuristic landscapes, sleek design, fantasy digital painting.
Once I copy-pasted it into an image generator it created a very nice image.
I think we're going to see a lot more use of language models for controlling computers to do complex tasks.
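As a rough sketch of what "programming by prompt" looks like from the caller's side, the snippet below only assembles the text. `build_prompt` is a hypothetical helper, and no model or API is called, since none is specified here.

```python
from typing import List, Optional, Tuple


def build_prompt(role: str, rules: List[str], request: str,
                 examples: Optional[List[Tuple[str, str]]] = None) -> str:
    """Assemble an instruction prompt; worked examples are optional."""
    lines = [f"I want you to act as {role}."]
    lines.extend(rules)
    for given, wanted in examples or []:
        lines.append(
            f'For example, if I ask for "{given}", you would include "{wanted}".'
        )
    lines.append(request)
    return "\n".join(lines)


prompt = build_prompt(
    role="a prompt creator for an AI image generator",
    rules=["Use terse words separated by commas."],
    request="Please create 5 prompts for a mob of grandmas with guns.",
    examples=[("abstract style", "style of Picasso, abstract brushstrokes")],
)
print(prompt)
```

Leaving out `examples` yields an instruction-only prompt, and swapping `role`, `rules` and `request` retargets the same text to a different task without touching the model.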
suflaj t1_j3ek7d0 wrote
> This is a complex and poorly-defined task
Not at all. First of all, ChatGPT does not understand complexity. It would do you well not to think of it like there is some hierarchy. Secondly, there is no requirement of it needing to be well defined. From what I could gather, ChatGPT requires you to convince it it is not giving out an opinion, and then it can hallucinate pretty much anything.
Specifically the task you gave it is likely implicitly present in the dataset, in the sense that the dataset allowed the model to learn the connections between the words you gave it. I hate to break your bubble, but the task is also achievable even with GPT2, a much less expressive model, since it can be represented as a prompt.
It will be easier to see the shortcomings there, but to put it simply, ChatGPT also has them, e.g. by default it does not, in the general case, differentiate between uppercase and lowercase letters even if it might be relevant for the task. Such things are too subtle for it. Once you realize the biases it has in this regard you begin to see through the cracks. Or generally once you give it a counting task, it says it can count but it is not always successful at it.
What is fascinating is the amount of memory ChatGPT has. Compared to other models, it is very big. But it is limited and it is not preserved outside of the session.
I would say that the people hyping it up probably just do not understand it that well. LLMs are fascinating, yes, but not ChatGPT specifically, it's how malleable the knowledge is. I would advise you to not understand it, because then the magic stays alive. I had a lot of fun for the first week when I was using it, but I never even use it nowadays.
I would also advise you to approach it more critically. I would advise you to first look into how blatantly racist and sexist it is. With that, you can see the reflection of its creators in it. And most of all, I would advise you to focus on its shortcomings. They are easy to find once you start talking to it more like you'd talk with a friend. They will help you use it more effectively.
currentscurrents t1_j3emas4 wrote
>I hate to break your bubble, but the task is also achievable even with GPT2
Is it? I would love to know how. I can run GPT2 locally, and that would be a fantastic level of zero-shot learning to be able to play around with.
I have no doubt you can fine-tune GPT2 or T5 to achieve this, but in my experience they aren't nearly as promptable as GPT3/ChatGPT.
>Specifically the task you gave it is likely implicitly present in the dataset, in the sense that the dataset allowed the model to learn the connections between the words you gave it
I'm not sure what you're getting at here. It has learned the connections and meanings between words of course, that's what a language model does.
But it still followed my instructions, and it can follow a wide variety of other detailed instructions you give it. These tasks are too specific to have been in the training data; it is successfully generalizing zero-shot to new NLP tasks.
suflaj t1_j3emtbh wrote
> I would love to know how to do this! I can run GPT2 locally, and that would be a fantastic level of zero-shot learning to be able to play around with.
It depends on how much you can compress the prompts. GPT2 is severely limited by memory. This means that you would need to train it on already condensed prompts. But in reality, it has the same (albeit not as refined) capabilities as ChatGPT.
> But it still followed my instructions
Well, it turns out that following instructions can be reduced to a symbol manipulation task. Again, you're giving it too much credit. I do agree that it is wide, but it is not as wide as Google or Wikipedia, which would represent humanity I guess.
> it is successfully generalizing zero-shot to new NLP tasks.
As are lesser models. Transformer-based models are fairly successful at it and we have hypothesized this since GPT2, and confirmed it with GPT3. But one thing: technically it generalized few-shot to a new NLP task. It generally hallucinates on zero-shot problems or states that it doesn't know. Ask it, for example, what a "gebutzeripanim" is. I made that up just now.
As for the task you gave it, you cannot claim it is zero-shot, as you cannot prove its components were not in the training data. Unless you want to say that you're pretty sure the prompt you gave it was not in the training data, but hey, that can apply to all generative models, that's what generalization is. But there are tasks it fails on because it just cannot do some things. Ask it to integrate or differentiate certain functions and you'll quickly see what I mean.
It can tell you all you want to know about integration, it can tell you all the rules perfectly, but it simply cannot apply them as well.
blose1 t1_j3j5vpp wrote
>Ask it, for an example, what a "gebutzeripanim" is. I made that up just now.
Q: what gebutzeripanim is?
A: I'm sorry, but I am unable to find any information about "gebutzeripanim." Could you please provide more context or clarify what you are asking about?
suflaj t1_j3ld7gy wrote
Exactly. It's not exactly meant to be a zero shot model, but rather a conversational agent. If it were a zero shot model, here is what it could say:
- gebutzeripanim is a new phrase for me
- based on its start it sounds german
- based on its middle it sounds vaguely korean
- based on its end it sounds vaguely turkish
- it might be made up slang
But in conversation you wouldn't just start decomposing stuff you never heard about. Here is what it says if you try to fish out these answers:
- I'm sorry, but I'm not familiar with the term "gebutzeripanim." Could you provide more context or clarify what you are referring to?
- Without more context or information about the word, it is difficult for me to determine whether it has a connection to the German language. "Gebutzeripanim" is not a word that I am familiar with, and a search of online resources did not yield any results. It is possible that this is a made-up or invented word, or that it is used in a specific context that is not familiar to me.
- Without more context or information about the word, it is difficult for me to determine whether it has a connection to the Korean language. "Gebutzeripanim" is not a word that I am familiar with, and a search of online resources did not yield any results. It is possible that this is a made-up or invented word, or that it is used in a specific context that is not familiar to me. It is also possible that the word includes elements from multiple languages or that it is a hybrid of several languages. Without more information, it is difficult to say for sure.
- Without more context or information about the word, it is difficult for me to determine whether it has a connection to the Turkish language. "Gebutzeripanim" is not a word that I am familiar with, and a search of online resources did not yield any results. It is possible that this is a made-up or invented word, or that it is used in a specific context that is not familiar to me. It is also possible that the word includes elements from multiple languages or that it is a hybrid of several languages. Without more information, it is difficult to say for sure.
- It is not possible for me to determine whether "gebutzeripanim" is a made-up or invented word, as I do not have sufficient information about the word. It is possible that "gebutzeripanim" is a slang term that is not widely known or used, or that it is a word that has been created recently and has not yet gained widespread recognition. Without more context or information, it is difficult to say for sure what the word means or how it is used.
It mostly bases its answers on the fact that it doesn't know. And given that it uses past answers as part of the prompts, you can see that the second time I ask it about a language it assumes the word can be made up of several languages, but not because it understands it, only because I mentioned it.
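The mechanism here, earlier turns being fed back in as part of the next prompt, can be sketched as follows. This is an assumption about the general shape of a chat wrapper, not ChatGPT's actual implementation; `Session`, `max_chars` and the character-based limit are all hypothetical (real systems count tokens).

```python
from typing import List


class Session:
    """Keeps a transcript and replays as much of it as fits in each prompt."""

    def __init__(self, max_chars: int = 200) -> None:
        self.max_chars = max_chars  # stand-in for a token limit
        self.turns: List[str] = []

    def record(self, speaker: str, text: str) -> None:
        self.turns.append(f"{speaker}: {text}")

    def next_prompt(self, question: str) -> str:
        """Build the prompt: newest turns first, until the budget runs out."""
        current = f"User: {question}"
        budget = self.max_chars - len(current)
        kept: List[str] = []
        for turn in reversed(self.turns):
            cost = len(turn) + 1  # +1 for the joining newline
            if cost > budget:
                break  # older turns are forgotten
            kept.append(turn)
            budget -= cost
        return "\n".join(list(reversed(kept)) + [current])


chat = Session()
chat.record("User", "Does it sound German?")
chat.record("Bot", "It may be a hybrid of several languages.")
print(chat.next_prompt("Does it sound Korean?"))

fresh = Session()  # a new session starts with no memory of the above
print(fresh.next_prompt("Does it sound Korean?"))
```

In the first prompt the "hybrid of several languages" text is sitting right in the input, so a model repeating it proves nothing about understanding. In the fresh session it is simply absent.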
If you ask it in a new session whether it is made out of words or phrases from several languages, it answers with
> I'm sorry, but I am unable to find any information about a word spelled "gebutzeripanim." It is possible that this is a made-up word or a word from a language that I am not familiar with. Can you provide any context or additional information about the word that might help me to better understand it?
Since it basically needs to explicitly see things in training, it's not really a zero-shot, but rather a few-shot model. There are instances where it seems like it can connect the dots but you can't really say it happens in the general case...