visarga

visarga t1_j4cqkkb wrote

Sometimes you can exploit asymmetric difficulty. For example, factorising polynomials is hard, but multiplying a bunch of degree-1 polynomials is easy. So you can generate data for free, and it will be very diverse. Because the data has a compositional structure, solving it forces the model to apply rules correctly without overfitting.
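A quick sketch of that data generator (function names are mine, just for illustration): multiplying degree-1 factors is a coefficient convolution, so the (expanded, factored) pair comes for free.

```python
import random

def poly_mul(a, b):
    # Multiply two polynomials given as coefficient lists (lowest degree first).
    out = [0] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            out[i + j] += ca * cb
    return out

def make_example(n_roots=3, lo=-5, hi=5):
    # Sample integer roots, multiply the factors (x - r) to expand.
    # Expansion is the easy direction; the hard direction (factorising
    # the expanded coefficients back into roots) is the training task.
    roots = [random.randint(lo, hi) for _ in range(n_roots)]
    coeffs = [1]
    for r in roots:
        coeffs = poly_mul(coeffs, [-r, 1])  # factor (x - r)
    return coeffs, sorted(roots)
```

Every call yields a fresh (coefficients, roots) pair, so the dataset can be made as large as needed.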

Taking derivatives and integrals is similar: differentiation is easy, integration is hard. And solving the task will teach the model something about symbolic manipulation.
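Same trick in miniature (illustrative names, polynomials only): differentiate mechanically, then train the model on the inverse direction, predicting f from f'.

```python
import random

def derivative(coeffs):
    # d/dx of sum c_k x^k is sum k*c_k x^(k-1); coeffs lowest degree first.
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def make_pair(max_deg=4):
    # (f, f') pairs: differentiation is the easy direction, so supervision
    # for the hard inverse task (integration) is free.
    f = [random.randint(-9, 9) for _ in range(max_deg + 1)]
    return f, derivative(f)
```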

More generally you can use an external process, a simulator, an algorithm or a search engine to obtain a transformation of input X to Y, then learn to predict Y from X or X from Y. "Given this partial game of chess, predict who wins" and such. If X has compositional structure, solving the task would teach the model how to generalise, because you can generate as much data as necessary to force it not to overfit.
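A toy version of the external-process idea, using Nim instead of chess so the "simulator" fits in a few lines (the labelling rule is Bouton's theorem; all names here are illustrative):

```python
import random
from functools import reduce
from operator import xor

def nim_winner(heaps):
    # The "algorithm" that labels X: under optimal play, the player to move
    # wins iff the XOR of the heap sizes is nonzero (Bouton's theorem).
    return "first" if reduce(xor, heaps) != 0 else "second"

def make_dataset(n, max_heaps=4, max_size=7):
    # X = a random game position, Y = the computed outcome. A model trained
    # to predict Y from X has to pick up the game's compositional structure.
    data = []
    for _ in range(n):
        heaps = [random.randint(1, max_size)
                 for _ in range(random.randint(2, max_heaps))]
        data.append((heaps, nim_winner(heaps)))
    return data
```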

2

visarga t1_j419sn0 wrote

MS failed at search, abandoned the browser, and missed mobile; now they want a hit. It's about not fucking up again.

I don't think the GPT-3 model itself is a moat, someone will surpass it and make a free version soon enough. But the long term strategy is to become a preferred hosting provider. In a gold rush, sell shovels.

1

visarga t1_j4157w3 wrote

Of course the code fails on the first run. My code fails on the first run, too. But I can iterate. If MS allows feedback from the debugger, the model could fix most of its errors.

And when you want to solve a quantitative question, the best way is to ask for a Python script that prints the answer when executed.
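A minimal sketch of that run-and-feed-back-errors loop, assuming the generated script is plain Python (function name is mine):

```python
import subprocess
import sys
import tempfile

def run_candidate(code, timeout=5):
    # Execute model-generated code in a subprocess and capture stdout/stderr.
    # On failure, the traceback can be fed back to the model for another try.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run([sys.executable, path],
                            capture_output=True, text=True, timeout=timeout)
    return result.returncode == 0, result.stdout, result.stderr
```

The boolean tells you whether to accept the answer or loop again with the error message appended to the prompt.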

2

visarga t1_j3j58vc wrote

Reply to comment by turnip_burrito in Organic AI by Dramatic-Economy3399

I'd like to have an AI chatbot or assistant in the web browser to summarise, search, answer and validate stuff. Especially when the search results are full of useless ads and crap, I don't want to see them anymore. But I want validation.

This AI assistant will be my "kid" (run on my own machine) and listen to my instructions, not Google's or anyone else's. Any interaction with it remains private unlike web search. It should run efficiently on a normal desktop like Stable Diffusion - that will be the hardest part. Go Stability.ai!

5

visarga t1_j3ecrx8 wrote

Yes, I agree traditional NLP tasks are mostly solved: a large number of new skills were unlocked at once. And they work so well without fine-tuning, just from the prompt.

So take your task to ChatGPT (or text-davinci-003) and label your dataset or generate more data. Then fine-tune a small transformer from Hugging Face. You get an efficient, cheap model.
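The recipe in miniature, with a toy stand-in for the teacher and a perceptron as the small student (everything here is illustrative; in practice the teacher is the ChatGPT labelling pass and the student a small Hugging Face transformer):

```python
import random

def teacher(x):
    # Stand-in for the expensive teacher model: here it just labels
    # points with a hidden linear rule.
    return 1 if 2 * x[0] - x[1] > 0 else 0

def train_student(n=2000, epochs=20, lr=0.1):
    # 1. Generate cheap unlabelled inputs, 2. label them with the teacher,
    # 3. fit a tiny student -- the distillation recipe in miniature.
    random.seed(0)
    xs = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
    ys = [teacher(x) for x in xs]
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = y - pred  # perceptron update on teacher labels
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            b += lr * err
    acc = sum((1 if w[0] * a + w[1] * c + b > 0 else 0) == y
              for (a, c), y in zip(xs, ys)) / n
    return w, b, acc
```

The student never sees hand-made labels; it inherits the teacher's behaviour at a fraction of the inference cost.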

9

visarga t1_j39xs2x wrote

No, this concept is older; it predates Google. Hinton was working on it in 1986 and Schmidhuber in the 1990s. By the way, "next token prediction" is not necessarily state of the art: the UL2 paper showed it is better to use a mix of masked spans.
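A heavily simplified sketch of what a masked-span training example looks like in the T5/UL2 family (real UL2 mixes several denoiser configurations; the sentinel token and function name here are illustrative):

```python
def span_corrupt(tokens, start, length, sentinel="<X>"):
    # Span corruption: the input has the span replaced by a sentinel;
    # the target is the sentinel followed by the removed span.
    inp = tokens[:start] + [sentinel] + tokens[start + length:]
    tgt = [sentinel] + tokens[start:start + length]
    return inp, tgt
```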

If you follow the new papers, there are a thousand ideas floating around. How to make models learn better, how to make them smaller, how to teach the network to compose separate skills, why training on code improves reasoning skills, how to generate problem solutions as training data... we just don't know which are going to matter down the line. It takes a lot of time to try them out.

Here's a weird new idea: StitchNet: Composing Neural Networks from Pre-Trained Fragments (link). People try anything and everything.

Or this one: Massive Language Models Can Be Accurately Pruned in One-Shot (link). Maybe it means we will be able to run GPT-3-sized models on a gaming desktop instead of a $150,000 machine.

2

visarga t1_j36i9o1 wrote

> I almost worry more about the overall level of conscious happiness throughout all of time and space throughout all dimensions/simulations/realities because that is the ONLY thing that ultimately matters in the end

This doesn't make sense from an evolutionary point of view. There's no big brotherhood of conscious entities; it's competition for resources.

2

visarga t1_j36ccg4 wrote

> Innatera claims 10000x lower power usage with their chip.

Unfortunately it's just a toy. Not gonna run GPT-3 on edge.

Googled it for you: Innatera's third-generation AI chip has 256 neurons and 65,000 synapses and runs inference at under 1 milliwatt. That doesn't sound like much compared to the human brain, which has 86 billion neurons and runs at around 20 watts.
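Putting those numbers side by side (back-of-envelope only, and note that spiking-chip "neurons" and biological neurons are not really comparable units):

```python
# Per-"neuron" power from the figures above.
chip_w_per_neuron = 1e-3 / 256    # < 1 mW spread over 256 neurons
brain_w_per_neuron = 20 / 86e9    # 20 W spread over 86 billion neurons
ratio = chip_w_per_neuron / brain_w_per_neuron
print(f"brain is ~{ratio:,.0f}x more power-efficient per neuron")
```

So even the "10000x lower power" chip is still over four orders of magnitude behind the brain per neuron.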

4