visarga
visarga t1_itapbde wrote
Reply to comment by AsthmaBeyondBorders in U-PaLM 540B by xutw21
The same CLIP architecture that guides SD to draw pretty images can also guide an industrial robot to accomplish tasks.
visarga t1_itap4rx wrote
Reply to comment by FirstOrderCat in U-PaLM 540B by xutw21
LMs can be coupled with toys - an execution environment to run pieces of code it generates, a search engine, a knowledge base, or even a simulator. They "infuse" strict symbolic consistency into the process creating a hybrid neural-symbolic system.
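A minimal sketch of the "execution environment" toy, assuming the LM has already emitted a snippet of Python (the generated snippet here is a hard-coded stand-in, not real model output):

```python
import subprocess
import sys

def run_generated_code(code: str, timeout: int = 5) -> str:
    """Execute a model-generated Python snippet in a subprocess and
    return its stdout, giving the LM exact symbolic feedback."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode != 0:
        return f"ERROR: {result.stderr.strip()}"
    return result.stdout.strip()

# Pretend the LM emitted this snippet for "what is 17 * 23?"
generated = "print(17 * 23)"
feedback = run_generated_code(generated)
print(feedback)  # -> 391
```

The feedback string (result or error) would be appended to the LM's context, which is where the strict symbolic consistency comes from.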
visarga t1_it96451 wrote
Reply to U-PaLM 540B by xutw21
The idea is actually from a 2021 paper by the same authors. Language models usually predict the next token when they are GPT-like, and predict randomly masked words when they are BERT-like. They combine both objectives and discover this has a huge impact on scaling laws. In other words, we were using the wrong mix of noise to train the model. The new solution is 2x better than before.
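Roughly, the two objectives differ like this (a toy sketch over word tokens, not the actual training code):

```python
import random

tokens = "the quick brown fox jumps over the lazy dog".split()

def causal_lm_example(toks):
    """GPT-style: predict each next token from its prefix."""
    return [(toks[:i], toks[i]) for i in range(1, len(toks))]

def span_corruption_example(toks, span_len=2):
    """BERT/T5-style: mask a random span, predict the masked tokens."""
    start = random.randrange(len(toks) - span_len)
    corrupted = toks[:start] + ["<mask>"] + toks[start + span_len:]
    target = toks[start:start + span_len]
    return corrupted, target

# The mixed objective samples between the two per training example.
objective = random.choice([causal_lm_example, span_corruption_example])
```

The paper's contribution is finding the right mixture of such denoising objectives, not either one alone.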
This paper combines with the FLAN paper, which uses 1800 different tasks to instruction-tune the model. They hope that learning many tasks will teach the model to generalise to new tasks. An important trick is using chain of thought; without it there is a big drop. Both methods boost the score, and together they get the largest boost.
They even released the FLAN models. Google is on a roll!
I tried FLAN, and it reminds me of GPT-3 in how quickly it gets the task. It doesn't have the vast memory of GPT-3 though. So now I have on my computer a DALL-E-like model (SD) and a GPT-3-like model (FLAN-T5-XL), plus an amazing voice recognition system - Whisper. It's hard to believe. In just two years they shrunk GPT-3, and we have voice, image and language on a regular gaming desktop.
visarga t1_it92bst wrote
In this paper a group of researchers use GPT-3 to simulate a poll. They first collect a bunch of personal profiles. They also get the profile distribution of their target population from census data. Then, asking GPT-3 to assume those profiles, they run the poll questions. The result: GPT-3 can predict the poll outcome correctly.
> GPT-3 has biases that are “fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups.”
This means we can run any idea, tweet or piece of bullshit against a virtual poll and see how it would fare in a specific population. This is kind of like running a simulated world whose task is to be your focus group. I think this will catch on. I'm thinking especially of politicians, advertisers, startups - they all want to fine-tune their message. Who knows, maybe movie directors will run fake reviews past the virtual focus group before even making the movie - why risk your money on an idea that would turn out badly?
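A sketch of what that setup could look like - the profiles, weights and question below are made up for illustration, and the actual GPT-3 call is omitted (each prompt would be sent to the API):

```python
import random

# Hypothetical census-derived profile distribution (weights sum to 1).
profiles = [
    {"age": "18-29", "party": "independent", "region": "urban", "weight": 0.25},
    {"age": "30-49", "party": "democrat", "region": "suburban", "weight": 0.40},
    {"age": "50+", "party": "republican", "region": "rural", "weight": 0.35},
]

def build_poll_prompt(profile, question):
    """Condition the LM on a persona before asking the poll question."""
    persona = (f"You are a {profile['age']} year old {profile['party']} "
               f"voter living in a {profile['region']} area.")
    return f"{persona}\nQuestion: {question}\nAnswer:"

def sample_respondents(n):
    """Draw virtual respondents matching the census distribution."""
    weights = [p["weight"] for p in profiles]
    return random.choices(profiles, weights=weights, k=n)

prompts = [build_poll_prompt(p, "Do you support policy X?")
           for p in sample_respondents(1000)]
```

Aggregating the model's answers over the sampled prompts would give the simulated poll result.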
visarga t1_it8wcpw wrote
Reply to comment by Paladia in New open-source language model from Google AI: Flan-T5 🍮 by Ezekiel_W
Currently running it on my desktop, AutoModelForSeq2SeqLM.from_pretrained("ArthurZ/flan-t5-xl")
Seems to be very good at solving tasks that have the necessary information in the prompt, but not as great at general knowledge and code generation compared to GPT-3. I think it could be considered a mini GPT-3 you can run on your own machine. I'm thinking about building an agent inside the web browser on top of it, plus Whisper for speech.
visarga t1_it8pdf2 wrote
Reply to comment by AdditionalPizza in If you believe you can think exponentially, you might be wrong. Transformative AI is here, and it is going to radically change the world before the Singularity, and before AGI. by AdditionalPizza
> It needs to be 10 to 15% of the workforce jobless with no skills outside of their extinct domain.
The number of job positions the economy supports is not hard-capped at some maximum value. It's not a zero-sum game; more robots doesn't mean fewer people working. But as soon as we get the fruits of this technology we raise our expectations, and expectations rise much faster than automation can automate. Just providing clean air, good food and basic necessities for everyone is a hard task; I bet we'll still be working until we accomplish it.
visarga t1_it8on9t wrote
Reply to comment by DungeonsAndDradis in If you believe you can think exponentially, you might be wrong. Transformative AI is here, and it is going to radically change the world before the Singularity, and before AGI. by AdditionalPizza
The recent Whisper model is rumoured to have been created to transcribe all the speech on YT in order to feed the next iteration of language models.
visarga t1_it8o018 wrote
Reply to comment by ftc1234 in If you believe you can think exponentially, you might be wrong. Transformative AI is here, and it is going to radically change the world before the Singularity, and before AGI. by AdditionalPizza
> Generating intermediate results and trying out possibilities of outcomes is not reasoning.
Could be. People are doing something similar when faced with a novel problem. It doesn't count if you've memorised the best action from previous experience.
visarga t1_it6nwso wrote
Reply to comment by ftc1234 in If you believe you can think exponentially, you might be wrong. Transformative AI is here, and it is going to radically change the world before the Singularity, and before AGI. by AdditionalPizza
> But can it reason by itself without seeing pattern ahead of time? Can it distinguish between the quality of the results it generates? Can it have an opinion that’s not in the mean of the output probability distribution?
Yes, it's only gradually ramping up, but there is a concept of learning from verification. For example AlphaGo learned from self play, but it was trivial to verify who won the game. In math it is possible to plug the solution back to verify it, in code it is possible to run it or apply test driven feedback, with robotics it is possible to run sims and learn from outcomes.
When you move to purely textual tasks it becomes more complicated, but there are approaches. For example, if you have a collection of problems (multi-step, complex ones) and their answers, you can train a model to generate intermediate steps and supporting facts. Then you use those intermediate steps to generate the final answer - an answer you can verify. This trains the model to discover step-by-step solutions on its own and solve new problems.
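A toy sketch of that verify-then-train loop, with a random "solver" standing in for the language model (the generator here is invented for illustration):

```python
import random

def generate_rationale(problem):
    """Stand-in for an LM sampling a step-by-step solution.
    Here: a toy adder that sometimes makes arithmetic slips."""
    a, b = problem
    step = a + b + random.choice([0, 0, 1, -1])  # occasional mistakes
    return f"First add {a} and {b} to get {step}.", step

def collect_verified_rationales(problems_with_answers, samples=8):
    """Keep only chains of thought whose final answer checks out;
    these become training data for the next round of training."""
    kept = []
    for problem, answer in problems_with_answers:
        for _ in range(samples):
            rationale, predicted = generate_rationale(problem)
            if predicted == answer:
                kept.append((problem, rationale))
                break
    return kept

dataset = [((2, 3), 5), ((10, 7), 17), ((4, 4), 8)]
verified = collect_verified_rationales(dataset)
```

Only the verified rationales survive, so even a noisy generator yields clean step-by-step supervision.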
Another approach is to use models to curate the training data. For example, LAION-400M is a dataset curated from noisy text-image pairs by generating alternative captions and then picking the best one - either the original or one of the generated captions. So we use models to improve our training data, which will boost future models in places that are out of distribution.
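The curation step could be sketched like this - the `similarity` function is a crude word-overlap stand-in for a real image-text score like CLIP's, and the captions are invented:

```python
def similarity(caption, image_tags):
    """Stand-in for a CLIP image-text score: crude word overlap
    between a caption and a set of tags describing the image."""
    words = set(caption.lower().split())
    return len(words & image_tags) / max(len(words), 1)

def curate_caption(original, generated_alternatives, image_tags):
    """Keep whichever caption, original or model-generated,
    scores highest against the image."""
    candidates = [original] + generated_alternatives
    return max(candidates, key=lambda c: similarity(c, image_tags))

image_tags = {"dog", "beach", "running"}
best = curate_caption(
    "IMG_4032.JPG",                           # noisy original alt-text
    ["a dog running on the beach", "a cat"],  # generated captions
    image_tags,
)
print(best)  # -> a dog running on the beach
```

The useless filename alt-text loses to the generated caption, which is exactly the cleanup effect described above.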
So it's all about being creative but then verifying somehow and using the signal to train.
visarga t1_it6ng34 wrote
Reply to comment by kmtrp in If you believe you can think exponentially, you might be wrong. Transformative AI is here, and it is going to radically change the world before the Singularity, and before AGI. by AdditionalPizza
But in reality the opposite seems to happen: we tend to overestimate the impact of a new technology in the short run and underestimate it in the long run. We are very emotional about it when it's close and don't care when it's far away.
visarga t1_it6lzdv wrote
Reply to comment by Down_The_Rabbithole in If you believe you can think exponentially, you might be wrong. Transformative AI is here, and it is going to radically change the world before the Singularity, and before AGI. by AdditionalPizza
> physical laborers, especially the underpaid ones like janitorial work, cleaners and miners will be the last jobs to be automated away, not the first.
Automation is coming for everyone, artist, programmer, office worker or physical laborer.
I guess you haven't seen this model - From Play to Policy. With just 5 hours of robot free play they trained a model to control a robotic arm in a kitchen environment. In other words, learning to act (decision transformers) seems to work really well. I expect robotic dexterity to improve quickly. It's just 3-4 years behind text and image.
Related to this, I think we'll see large models trained on the entirety of YouTube, learning both desktop skills (like automating computer UIs) and robotic skills (like carpentry, cooking and managing a household). Massive video models have been conspicuously missing, probably still too expensive to train, but expect them to start popping up on the horizon.
There's a whole wealth of information in audio-video that is missed in text and image, exactly the kind of information that will automate the jobs you think are safer. And besides video, the simulation field is ramping up with all sorts of 3D environments to train agents in.
visarga t1_it6lkr0 wrote
Reply to comment by Effective-Dig8734 in If you believe you can think exponentially, you might be wrong. Transformative AI is here, and it is going to radically change the world before the Singularity, and before AGI. by AdditionalPizza
Language models are even more accessible than the internet and social media. You can talk with them directly, they can teach you what you need to learn, and they don't have the discoverability problem of text or image UIs. It's going to be the most natural thing to talk to an LM to solve tasks. And an LM could consistently deliver better quality than internet search and social media. Useful + accessible = quick adoption.
visarga t1_it4qoq7 wrote
Reply to comment by newDeckardCain in A YouTube large language model for a scant $35 million. by Angry_Grandpa_
After text, image and video (+ audio) I think we have all the bases covered. Nobody can claim AI is not grounded anymore. And with this grounding comes a nuanced, semantic understanding of the world. It's like an upload - not of a person, but of the whole culture at once.
visarga t1_it4q1uj wrote
Reply to comment by Angry_Grandpa_ in A YouTube large language model for a scant $35 million. by Angry_Grandpa_
Train a comment filter. Some comments are great - it depends very much on the topic. In fact, scrap that! Do a GPT-4chan and train on the real YT comments. Then instruction-tune the model to be polite. Better a polite model that knows all the shitty stuff too, so it gets the jokes.
visarga t1_it4lygh wrote
Reply to comment by ReasonablyBadass in A YouTube large language model for a scant $35 million. by Angry_Grandpa_
Visual data can be described in text, and maybe it's better to do so in order to avoid overfitting to irrelevant details. We have great captioning models for image and video, so we can use them together with speech recognition models. Just imagine a model trained on YT videos playing the sports commentator role - wouldn't it be great to have a virtual commentator for your vids?
But I am excited about training on massive video because it is special - it contains a trove of procedural knowledge, how to do things, step by step. That means you can finetune it later to automate anything you want. Your clumsy robot just got GPT-3 level smarts in practical tasks rarely described in words anywhere.
There was a recent paper - with just 5 hours of robot video and proprioception they trained a transformer to manipulate a toy kitchen and accomplish tasks. Pretty amazing, considering the Wozniak test for AI: a robot enters a random kitchen and has to make a cup of coffee. There are millions of kitchens on YT - millions of everything, in fact.
Looks like "learning to act" is going to be very successful, just like learning to generate text and images. Maybe the handymen won't be the last to be automated.
visarga t1_it323xj wrote
Reply to [D] Discussion Panel for FOSS Instruct by FerretDude
I'd like to see information extraction from semi structured documents like receipts, invoices, forms, contracts, screen shots (apps), etc. The format - question answering, you prompt with a document transcribed in text and a question, get the value in return.
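The prompt format could look something like this - the document, question and field names are just an illustration:

```python
def extraction_prompt(document_text, question):
    """Frame information extraction as QA: a transcribed document
    plus a question go in, the field value comes back."""
    return (
        "Document:\n"
        f"{document_text}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

receipt = "ACME Store\n2x Widget  $4.00\nTotal: $8.00\nDate: 2022-10-19"
prompt = extraction_prompt(receipt, "What is the total amount?")
```

The same template would cover receipts, invoices, forms and contracts - only the document text and question change.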
visarga t1_it16573 wrote
Reply to comment by NoRip7374 in Talked to people minimizing/negating potential AI impact in their field? eg: artists, coders... by kmtrp
Let your imagination run wild: what will we do when we become more productive - go home, or build things we can't even imagine yet? If we still want more than what is possible today, how can we afford to send people home? So many grand challenges are far from being solved - global warming, space colonisation, poverty, AI implementations, public education... we still need people for a while.
visarga t1_it15qub wrote
Reply to comment by freeman_joe in Talked to people minimizing/negating potential AI impact in their field? eg: artists, coders... by kmtrp
I bet we underestimate progress in some ways and overestimate it in other ways. The future is here but unequally distributed. There will still be a need for humans unless AI has cleared that last 1% of accuracy, which is damn hard as we can see from self driving.
visarga t1_it15bg3 wrote
Reply to comment by ginger_gcups in Talked to people minimizing/negating potential AI impact in their field? eg: artists, coders... by kmtrp
> A supply of matter and energy
I think some raw materials are going to be inevitably contested unless we find abundant replacements or reach 100% recycling rate. A replicator won't save us if it needs rare material X.
visarga t1_iszuegi wrote
Reply to comment by gravitas_shortage in [D] GPT-3 is a DREAM for citation-farmers - Threat Model Tuesday #1 by TiredOldCrow
Rather than detecting fakes I'd rather have a model that can generate and implement papers. I bet there's a ton of samples to train on. Close the loop on AI self improvement.
visarga t1_isyfm7d wrote
Reply to comment by ajt9000 in [D] How frustrating are the ML interviews these days!!! TOP 3% interview joke by Mogady
I don't think most candidates have a repo to show, maybe just an empty one.
visarga t1_isqa867 wrote
Reply to comment by Background-Loan681 in Is this imagination? by Background-Loan681
> Does the chatbot imagined something in their head before describing it to me as a prompt?
You're attributing to the model what is really the merit of the training data. It's culture that knows what a great answer to your task would look like - once that culture is loaded into a brain or an AI.
What I mean is that the substrate doesn't matter - as long as it has learned the distribution, it can imagine coherent and amazing things. That's all to the merit of the training data, though. The brain or the model just dutifully carries it in a compact form that can be unfolded in new ways on demand.
visarga t1_isq80ci wrote
Reply to comment by raccoon8182 in Is this imagination? by Background-Loan681
It might surprise you that GPT-3-like models don't have just one bias, one point of view - that of their builders - as they are often accused of.
The model learns all personality types, and emulates their biases to a very fine degree. It is in fact so good that researchers can run simulations of polls on GPT-3. In order to replicate the target population they prompt the model with a collection of personality profiles with the right distribution.
So you, as the user of the model, are in charge. You can make it assume any bias you want - just pick your poison. There is no "absolutely unbiased" mode unless you have that kind of training data. That means the model is a synthesis of all personalities. It's more like humanity than a single person.
visarga t1_isq72rp wrote
Reply to comment by Ortus12 in Is this imagination? by Background-Loan681
> when Open Ai starts plugging in different Ai systems into each other.
Language Model Cascades
https://twitter.com/karpathy/status/1550590818041311232?lang=en
visarga t1_itau1y1 wrote
Reply to comment by very_bad_programmer in New open-source language model from Google AI: Flan-T5 🍮 by Ezekiel_W
Can't seem to get conversation out of it - it's a T5 variant and seems geared towards task solving. There is also a GPT variant that they haven't released.