visarga

visarga t1_itap4rx wrote

Reply to comment by FirstOrderCat in U-PaLM 540B by xutw21

LMs can be coupled with tools - an execution environment to run pieces of code they generate, a search engine, a knowledge base, or even a simulator. These tools "infuse" strict symbolic consistency into the process, creating a hybrid neural-symbolic system.
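Roughly the shape of it, as a minimal sketch - the `generate` function is a placeholder for whatever model or API you use, and the "RUN:" convention is made up for illustration:

```python
def generate(prompt: str) -> str:
    raise NotImplementedError  # plug in your LM of choice here

def answer_with_code_tool(question: str, rounds: int = 3) -> str:
    prompt = (question + "\nIf computation is needed, reply with a line 'RUN:'"
              " followed by Python that assigns a variable named result.")
    reply = generate(prompt)
    for _ in range(rounds):
        if "RUN:" not in reply:
            return reply                      # plain answer, no tool needed
        code = reply.split("RUN:", 1)[1]      # the code the model wrote
        scope = {}
        exec(code, scope)                     # exact, symbolic computation
        prompt += f"\n{reply}\nExecution result: {scope.get('result')}\nFinal answer:"
        reply = generate(prompt)
    return reply
```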

5

visarga t1_it96451 wrote

Reply to U-PaLM 540B by xutw21

The idea is actually from a 2021 paper by the same authors. Language models usually predict the next token when they are GPT-like, and predict randomly masked words when they are BERT-like. The authors combine both objectives and discover it has a huge impact on scaling laws. In other words, we were using the wrong mix of noise to train these models; the new mixture is roughly 2x more compute-efficient than before.
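A toy sketch of the two objectives being mixed (the real UL2/U-PaLM recipe uses span corruption with sentinel tokens and several noise settings; this only shows the idea):

```python
import random

tokens = "the cat sat on the mat".split()

def causal_examples(toks):
    """GPT-style: predict each next token from its prefix."""
    return [(toks[:i], toks[i]) for i in range(1, len(toks))]

def masked_example(toks, p=0.3):
    """BERT / span-corruption style: hide random tokens, predict them."""
    masked, targets = [], []
    for t in toks:
        if random.random() < p:
            masked.append("[MASK]")
            targets.append(t)
        else:
            masked.append(t)
    return masked, targets

# A training mix draws from both kinds of examples instead of just one.
print(causal_examples(tokens)[:2])
print(masked_example(tokens))
```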

This paper combines with the FLAN paper, which uses 1800 different tasks to instruction-tune the model, the hope being that learning many tasks teaches the model to generalise to new ones. An important trick is chain-of-thought prompting; without it there is a big drop. Each method boosts the score on its own, and together they give the largest boost.

They even released the FLAN models. Google is on a roll!

I tried FLAN; it reminds me of GPT-3 in how quickly it picks up a task. It doesn't have the vast memory of GPT-3, though. So now I have on my computer a DALL-E-like model (Stable Diffusion) and a GPT-3-like model (FLAN-T5-XL), plus an amazing speech recognition system, Whisper. It's hard to believe. Two years on, GPT-3 has effectively been shrunk, and we have voice, image and language on a regular gaming desktop.

14

visarga t1_it92bst wrote

In this paper a group of researchers use GPT-3 to simulate a poll. They first collect a bunch of demographic profiles and get the distribution of those profiles in their target population from census data. Then, asking GPT-3 to assume each profile, they run the poll questions. The result: GPT-3 can predict the poll results accurately.

> GPT-3 has biases that are “fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups.”

This means we can run any idea, tweet or bullshit against a virtual poll and see how it would fare in a specific population. It's kind of like running a simulated world whose only task is to be your focus group. I think this will catch on, especially with politicians, advertisers and startups - they all want to fine-tune their message. Who knows, maybe movie directors will run draft reviews against a virtual focus group before even making the movie; why risk your money on an idea that would turn out badly?
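As a rough sketch of how such a simulated poll could be wired up (the profile strings, their weights and the `ask_model` call are all made up for illustration):

```python
import random
from collections import Counter

def ask_model(prompt: str) -> str:
    raise NotImplementedError  # stand-in for a GPT-3-style completion call

# Hypothetical target distribution taken from census data.
census = {
    "a 35 year old urban teacher": 0.40,
    "a 60 year old rural farmer": 0.35,
    "a 22 year old college student": 0.25,
}

def simulate_poll(question: str, n: int = 500) -> Counter:
    profiles = random.choices(list(census), weights=list(census.values()), k=n)
    answers = Counter()
    for profile in profiles:
        prompt = f"You are {profile}. Question: {question}\nAnswer yes or no:"
        answers[ask_model(prompt).strip().lower()] += 1
    return answers
```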

https://jack-clark.net/2022/10/11/import-ai-305-gpt3-can-simulate-real-people-ai-discovers-better-matrix-multiplication-microsoft-worries-about-next-gen-deepfakes/

2

visarga t1_it8wcpw wrote

Currently running it on my desktop, AutoModelForSeq2SeqLM.from_pretrained("ArthurZ/flan-t5-xl")

Seems to be very good at solving tasks that have the necessary information in the prompt, but not as strong as GPT-3 on general knowledge and code generation. I think of it as a mini-GPT-3 you can run on your own machine. I'm thinking about building an agent inside the web browser on top of it, plus Whisper for speech.
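For anyone who wants to try it, a minimal loading-and-prompting sketch with transformers (checkpoint name as above; the google/flan-t5-xl checkpoint should work here too):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "ArthurZ/flan-t5-xl"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

prompt = ("Answer the question using the passage.\n"
          "Passage: The meeting was moved to Friday at 3pm.\n"
          "Question: When is the meeting?")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```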

13

visarga t1_it8pdf2 wrote

> It needs to be 10 to 15% of the workforce jobless with no skills outside of their extinct domain.

The number of jobs the economy supports is not hard-capped at some maximum value. It's not a zero-sum game; more robots doesn't mean fewer jobs for people. As soon as we get the fruits of this technology we raise our expectations, and our expectations rise much faster than automation can keep up. Just providing clean air, good food and basic necessities for everyone is a hard task; I bet we'll still be working until we accomplish it.

2

visarga t1_it8o018 wrote

> Generating intermediate results and trying out possibilities of outcomes is not reasoning.

Could be. People do something similar when faced with a novel problem. It doesn't count as reasoning if you've simply memorised the best action from previous experience.

1

visarga t1_it6nwso wrote

> But can it reason by itself without seeing pattern ahead of time? Can it distinguish between the quality of the results it generates? Can it have an opinion that’s not in the mean of the output probability distribution?

Yes, it's only gradually ramping up, but there is a concept of learning from verification. For example, AlphaGo learned from self-play, where it was trivial to verify who won each game. In math you can plug the solution back in to verify it, in code you can run it or apply test-driven feedback, and in robotics you can run simulations and learn from the outcomes.

When you move to purely textual tasks it becomes more complicated, but there are approaches. For example, if you have a collection of complex, multi-step problems and their answers, you can have a model generate intermediate steps and supporting facts, then use those intermediate steps to produce a final answer you can verify against the known one. This trains the model to discover step-by-step solutions on its own and to solve new problems.
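A rough sketch of that loop, in the spirit of self-taught-reasoner style methods (`generate_reasoning` is a stand-in for sampling a chain of thought plus answer from the model):

```python
def generate_reasoning(question: str) -> tuple[str, str]:
    """Stand-in for sampling (intermediate_steps, final_answer) from the model."""
    raise NotImplementedError

def build_verified_set(problems):
    """problems: list of (question, known_answer) pairs."""
    keep = []
    for question, known_answer in problems:
        for _ in range(4):                               # a few samples per problem
            steps, answer = generate_reasoning(question)
            if answer.strip() == known_answer.strip():   # cheap verification
                keep.append((question, steps, answer))   # becomes training data
                break
    return keep                                          # finetune the model on these
```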

Another approach is to use models to curate the training data. For example, LAION-400M is a dataset curated from noisy text-image pairs by generating alternative captions and then picking the best one - either the original or one of the generated ones. So we use models to grow our training data, which will boost future models in places that are currently out of distribution.
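As a sketch of the model-based curation step - score candidate captions against the image with a CLIP-style model and keep the best one (this is just the shape of the idea, not the exact LAION pipeline):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def best_caption(image: Image.Image, candidates: list[str]) -> str:
    inputs = processor(text=candidates, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        scores = clip(**inputs).logits_per_image[0]   # similarity per caption
    return candidates[scores.argmax().item()]
```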

So it's all about being creative but then verifying somehow and using the signal to train.

2

visarga t1_it6ng34 wrote

But in reality the opposite seems to happen: we tend to overestimate the impact of a new technology in the short run and underestimate it in the long run. We are very emotional about it when it's close and don't care when it's far away.

5

visarga t1_it6lzdv wrote

> physical laborers, especially the underpaid ones like janitorial work, cleaners and miners will be the last jobs to be automated away, not the first.

Automation is coming for everyone: artists, programmers, office workers and physical laborers alike.

I guess you haven't seen this model - From Play to Policy. With just 5 hours of free robot play they trained a model to control a robotic arm in a kitchen environment. In other words, learning to act (decision-transformer style) seems to work really well. I expect robotic dexterity to improve quickly; it's just 3-4 years behind text and image.

Related to this, I think we'll see large models trained on the entirety of YouTube, learning both desktop skills (like automating computer UIs) and robotic skills (like carpentry, cooking and managing a household). Massive video models have been conspicuously absent, probably because they are still too expensive to train, but expect them to start popping up on the horizon.

There's a wealth of information in audio and video that is missing from text and images - exactly the kind of information that will automate the jobs you think are safer. And besides video, the simulation field is ramping up with all sorts of 3D environments to train agents in.

12

visarga t1_it6lkr0 wrote

Language models are even more accessible than the internet and social media. You can talk with them directly, they can teach you what you need to learn, and they don't have the discoverability problem of text or image UIs. Talking to a LM to solve tasks is going to feel completely natural, and a LM could consistently deliver better quality than internet search and social media. Useful + accessible = quick adoption.

8

visarga t1_it4lygh wrote

Visual data can be described in text, and maybe it's better to do so in order to avoid overfitting to irrelevant details. We have great captioning models for image and video, so we can use them together with speech recognition models. Just imagine a model trained on YT videos playing the sports commentator role - wouldn't it be great to have a virtual commentator for your vids?
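Something like this is already doable with off-the-shelf pieces - caption sampled frames and transcribe the audio. The checkpoints below are common public ones, picked only for illustration:

```python
from transformers import pipeline

caption = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
transcribe = pipeline("automatic-speech-recognition", model="openai/whisper-small")

def describe_clip(frames, audio_path):
    """frames: PIL images sampled from the video; audio_path: extracted audio file."""
    frame_captions = [caption(f)[0]["generated_text"] for f in frames]
    speech = transcribe(audio_path)["text"]
    return "\n".join(frame_captions) + "\nSpeech: " + speech
```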

But I am excited about training on massive video because it is special - it contains a trove of procedural knowledge: how to do things, step by step. That means you can fine-tune such a model later to automate almost anything you want. Your clumsy robot just got GPT-3-level smarts at practical tasks that are rarely described in words anywhere.

There was a recent paper where, with just 5 hours of robot video and proprioception, they trained a transformer to manipulate a toy kitchen and complete tasks. Pretty amazing, considering the Wozniak test for AI: a robot enters a random kitchen and has to make a cup of coffee. There are millions of kitchens on YT - millions of everything, in fact.

Looks like "learning to act" is going to be very successful, just like learning to generate text and images. Maybe the handymen won't be the last to be automated.

5

visarga t1_it323xj wrote

I'd like to see information extraction from semi-structured documents like receipts, invoices, forms, contracts, screenshots of apps, etc. The format would be question answering: you prompt with the document transcribed to text plus a question, and get the value back.
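A minimal sketch of that format with an instruction-tuned model (FLAN-T5 is just an example reader here; the receipt text is made up):

```python
from transformers import pipeline

qa = pipeline("text2text-generation", model="google/flan-t5-xl")

receipt = "ACME Store\n2x Coffee 7.00\n1x Bagel 2.50\nTOTAL 9.50\nPaid by card"
prompt = f"Document:\n{receipt}\n\nQuestion: What is the total amount?\nAnswer:"
print(qa(prompt)[0]["generated_text"])
```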

6

visarga t1_it16573 wrote

Let your imagination run wild: what will we do when we become more productive - go home, or build things we can't even imagine yet? If we still want more than what is possible today, how can we afford to send people home? So many grand challenges are far from solved - global warming, space colonisation, poverty, deploying AI, public education... we'll still need people for a while.

2

visarga t1_isqa867 wrote

> Does the chatbot imagined something in their head before describing it to me as a prompt?

You're attributing to the model what is really the merit of the training data. It's culture that knows what a great answer to your task would be - once that culture is loaded into a brain or an AI.

What I mean is that the substrate doesn't matter - as long as it has learned the distribution, it can imagine coherent and amazing things. That's all the merit of the training data, though. The brain or the model just dutifully carries it in a compact form that can be unfolded in new ways on demand.

1

visarga t1_isq80ci wrote

It might surprise you that GPT-3-like models don't have just one bias, one point of view - that of their builders, as is often claimed.

The model learns all personality types and emulates their biases to a very fine degree. It is in fact so good that researchers can run simulated polls on GPT-3: to replicate the target population, they prompt the model with a collection of personality profiles drawn with the right distribution.

So you, as the user of the model, are in charge. You can make it assume any bias you want; just pick your preferred poison. There is no "absolutely unbiased" mode unless you have that kind of training data. The model is a synthesis of all personalities - more like humanity than like a single person.

5