CertainMiddle2382 t1_jb61410 wrote

We are nearing self-improving code, IMO.

Once we get past that, we have crossed the threshold.

Seeing the large variance in hardware cost/performance across current models, I'd think the margin for improvement from software optimization alone is huge.

I believe we already have the hardware required for one ASI.

Things will soon accelerate; the box has already been opened :-)

48

blueSGL t1_jb6h9jc wrote

>Seeing the large variance in hardware cost/performance across current models, I'd think the margin for improvement from software optimization alone is huge.

>I believe we already have the hardware required for one ASI.

Yep, how many computational "aha"-moment tricks are we away from running much better models on the same hardware?

Look at Stable Diffusion and how the memory requirement fell through the floor. We're already seeing something similar with LLaMA now getting into public hands (via links from pull requests on Facebook's GitHub, lol): there are already tricks getting implemented in front ends for LLMs that allow for lower VRAM usage.
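A rough back-of-envelope sketch of why precision tricks alone move the bar so much (the 7B/13B parameter counts and precisions are just illustrative, not exact model specs):

```python
# Toy estimate of LLM weight memory at different precisions.
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate GiB needed just to hold the weights."""
    return n_params * bits_per_param / 8 / (1024 ** 3)

for n_params, name in [(7e9, "7B"), (13e9, "13B")]:
    for bits in (32, 16, 8, 4):
        print(f"{name} @ {bits:>2}-bit: ~{weight_memory_gb(n_params, bits):.1f} GiB")
```

Going from fp32 to int4 is roughly an 8x cut in weight memory, which is the kind of drop that turns "datacenter only" into "fits on a consumer GPU".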

13

Baturinsky t1_jb6v5pm wrote

I haven't noticed any improvement in memory requirements for Stable Diffusion in 5 months... My RTX 2060 is still enough for 1024x640, but no more.

LLaMA does well in tests at small model sizes, but the small size could make it less of a fit for RLHF.

There is also miniaturisation for inference by reducing precision to int8 or even int4. But that doesn't work for training, and I believe AGI requires real-time training.
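As a minimal sketch of what that precision reduction looks like (plain NumPy, symmetric per-tensor int8 quantization; an illustration only, not any particular library's scheme):

```python
import numpy as np

# Symmetric per-tensor int8 quantization: store int8 values plus one float scale.
w = np.random.randn(4, 4).astype(np.float32)      # pretend these are trained weights
scale = np.abs(w).max() / 127.0                    # map the largest magnitude to 127
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

w_dequant = w_int8.astype(np.float32) * scale      # what inference actually computes with
print("max abs error:", np.abs(w - w_dequant).max())
# Good enough for a forward pass, but the rounding swamps the tiny gradient
# updates you'd need during training, hence "inference only".
```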

So, in theory, AGI could be achieved even without big "aha"s. Take existing training methods, train on many different domains and data architectures, add tree search from AlphaGo and real-time training - and we will probably be close. But it would require pretty big hardware. And it would be "only" superhuman in some specific domains.

3

fluffy_assassins t1_jb6vmgz wrote

I have kind of a theory.

There used to be self-modifying code in assembler because computing power was more expensive than programmers' time. So programmers spent more of their time to get more out of the expensive hardware.

I'm thinking that when transistors can't shrink anymore (quantum effects and all), we're going to need to squeeze out all the computing power we can, to the point where we end up right back at self-modifying code. Though probably written by AI this time. I don't think a human could debug that, though!
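For what it's worth, here's a toy Python analogue of the old assembler trick (purely illustrative; real self-modifying assembler patched instructions in place, this just regenerates and swaps out a function at runtime):

```python
# Toy "self-modifying" program: it rewrites one of its own functions while running.
SRC = "def hot_loop(x):\n    return x * {k}\n"

def specialize(k: int):
    """Generate and compile a new version of hot_loop with k baked in."""
    namespace = {}
    exec(SRC.format(k=k), namespace)   # compile the rewritten source
    return namespace["hot_loop"]

hot_loop = specialize(2)
print(hot_loop(10))        # 20
hot_loop = specialize(7)   # "modify" the code while the program runs
print(hot_loop(10))        # 70
```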

3

visarga t1_jb79kst wrote

Back-propagation is self-modifying code. There is also meta-back-propagation for meta-learning, which is learning to modify a neural network to solve novel tasks.
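In the loosest sense, training really does rewrite the numbers that determine the program's behaviour. A bare-bones sketch of that loop (one linear neuron fit with plain NumPy; the data and learning rate are made up for illustration):

```python
import numpy as np

# One linear "neuron" fit by gradient descent: the loop repeatedly
# edits the weights, the network "modifying itself" to reduce the loss.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)  # target function

w = np.zeros(3)
lr = 0.1
for _ in range(200):
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(y)   # gradient of mean squared error
    w -= lr * grad                         # the "self-modification" step
print(w)   # ends up near [2.0, -1.0, 0.5]
```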

At a higher level, language models trained on code can cultivate a population of models with evolutionary techniques.

Evolution through Large Models

4

NothingVerySpecific t1_jb92tmn wrote

I understand some of those words

3

ahtoshkaa2 t1_jb9czan wrote

Same) Haha. Thank god for ChatGPT:

The comment is referring to two different machine learning concepts: back-propagation and meta-back-propagation, and how they can be used to modify neural networks.

Back-propagation is a supervised learning algorithm used in training artificial neural networks. It is used to modify the weights and biases of the neurons in the network so that the network can produce the desired output for a given input. The algorithm uses gradient descent to calculate the error between the predicted output and the actual output, and then adjusts the weights and biases accordingly.

Meta-back-propagation is an extension of back-propagation that is used for meta-learning, which is learning to learn. It involves modifying the neural network so that it can learn to perform novel tasks more efficiently.

The comment also mentions using evolutionary techniques to cultivate a population of models in language models trained on code. This refers to using genetic algorithms to evolve a population of neural networks, where the best-performing networks are selected and combined to create new generations of networks. This process is known as evolution through large models.
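A toy sketch of that select-and-mutate loop (evolving bit strings toward a target stands in for evolving programs or model variants; the fitness function and all parameters here are invented for illustration):

```python
import random

# Toy evolutionary loop: keep the fittest candidates, mutate them, repeat.
TARGET_LEN = 20

def fitness(genome):
    return sum(genome)                      # how many ones

def mutate(genome, rate=0.05):
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(TARGET_LEN)] for _ in range(30)]
for gen in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]               # selection: keep the best third
    population = [mutate(random.choice(parents)) for _ in range(30)]
print(max(fitness(g) for g in population))  # close to 20 after 50 generations
```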

7

vivehelpme t1_jb9flpl wrote

>We are nearing self-improving code, IMO.

Ah, the recurrent bootstrap-to-orbit meme. It's just around the corner, behind the self-beating dead horse.

0

CertainMiddle2382 t1_jb9gdce wrote

Hmm, it's not as if 2023 is only a little bit different from 2020, AI-wise.

The very concept of the singularity is self-improving AI pushing into ASI.

I don't get how you can trivialize an LLM seemingly starting to show competency in the very programming language it is written in.

What particular new characteristic of an AI would impress you more and show that things are accelerating?

I believe humans get desensitized very quickly, and when shown an ASI doing beyond-Standard-Model physics they will still manage to say: so what? I've been expecting more for at least 6 months…

4

vivehelpme t1_jb9jvtl wrote

>I don't get how you can trivialize an LLM seemingly starting to show competency in the very programming language it is written in.

The person who wrote the training code already had competency in that language; that didn't make the AI-programmer duo superhuman.

And then you decide to train the AI on the output of that programmer, so the AI-programmer duo becomes just the AI. But where does it learn to innovate its way into a superhuman, super-AI, super-everything state? It can generalize what a human can do. Well, that's good, but its creator could also generalize what a human can do.

Where is the miracle in this equation? You can train the AI on machine code and let it self-modify until perhaps the code is completely impossible for human beings to troubleshoot, but the system runs itself on 64 GPUs instead of 256. That makes it cheaper to run; it doesn't make it smarter.

>The very concept of the singularity is self-improving AI pushing into ASI.

That's an interpretation, a scenario. The core of it all comes from staring at growth graphs too long and realizing that exponential growth might exceed human capacity to follow.

Wikipedia says:

>The technological singularity—or simply the singularity[1]—is a hypothetical future point in time at which technological growth becomes uncontrollable and irreversible, resulting in unforeseeable changes to human civilization.

But how is that really different from:

>The technological singularity—or simply the singularity[1]—is a statistical observation of the current state of society, where growth at a large scale has resulted in innovation and data-collection rates that exceed the unaided human attention span; some claim this might result in unforeseeable changes to human civilization. On a global scale this is generally agreed to have happened around the invention of writing thousands of years ago (as there exists too much text for anyone to read in a lifetime), but some argue that it coincides more with the invention of the internet, as only then did you have the option to interactively access the global state of innovation and progress and realize that you cannot keep up with it even if you spent 24 hours a day reading scientific articles.[2] An online subculture argues that superhuman AI would be required for this statistical observation to be really true (see: no true Scotsman fallacy), despite their own admitted inability to even follow the real-time innovation rate in just their field of worship: AI.

1

CertainMiddle2382 t1_jb9m08p wrote

I don't get your point.

The programmer doesn't speak Klingon, yet the program can write good Klingon. AlphaZero's programmers don't play Go, yet the program can beat the best human Go players in the world.

By definition being better than a human at something means being « super intelligent » at that task.

Intelligence theory postulates G, and that it can be approximated with an IQ test.

« Super intelligent AI » will then by definition only need to show a higher IQ than either its programmers or the smartest human.

Nothing else.

Postulating the existence of G, it is quite possible that ASI (by definition again) will be better at other tasks not tested by the IQ test.

Rewriting a better IQ version of itself for example.

Recursively.

I really don't see the discussion here; these are only definitions.

5

vivehelpme t1_jbep9br wrote

>I don't get your point.

I guess my point is superintelligent by your definitions.

>The programmer doesn't speak Klingon, yet the program can write good Klingon.

It has generalized a human-made language.

>AlphaZero's programmers don't play Go, yet the program can beat the best human Go players in the world.

https://arstechnica.com/information-technology/2023/02/man-beats-machine-at-go-in-human-victory-over-ai/

It plays at a generalized high-elite level. It's also a one-trick pony. It's like saying a chainsaw is superintelligent because it can be used to saw down a tree faster than any lumberjack can with an axe.

>« Super intelligent AI » will then by definition only need to show a higher IQ than either its programmers or the smartest human.

So we could make an AlphaGo that only solves IQ-test matrices; it would be superintelligent by your definition, but it would be trash at actually being intelligent.

>I really don't see the discussion here; these are only definitions.

Yes, and the definition is that AI is trained on the idea of generalized mimicry; it's all about IMITATION, NOT INNOVATION.

This is all there is: you calculate a loss value based on how far from a human-defined gold standard the current iteration lands, and edit things to get closer. Everything we have produced in wowy AI is about CATCHING UP to human ability; there's nothing in our theories or neural network training practices that is about EXCEEDING human capabilities.

The dataset used to train a neural network is the apex of performance it can reach. You can at best land at a generalized, consistently very smart human level.
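The whole training signal really is just that distance-to-the-gold-standard number; a stripped-down sketch (cross-entropy against human labels, with all the labels and logits invented for illustration):

```python
import numpy as np

# The entire "learning" signal: how far the model's guesses are from
# human-provided labels. Labels and logits here are toy data.
human_labels = np.array([0, 2, 1])                 # the human-defined gold standard
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 0.4, 1.5],
                   [0.3, 1.8, 0.2]])               # the model's current guesses

probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss = -np.log(probs[np.arange(3), human_labels]).mean()
print(loss)   # training edits the weights purely to shrink this number
```

By construction, that number only rewards matching the labels, which is the point being made above.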

2

CertainMiddle2382 t1_jbeq0si wrote

You are obviously mistaken.

As you well know, zero-shot learning algorithms beat anything else; I saw a DeepMind analysis postulating that this allows them to explore parts of the gaming landscape that were never explored by humans.

And you seem to be moving the goalposts as you go along.

What is the testable characteristic that would satisfy you enough to declare the existence of an ASI?

For me it is easy: a higher IQ than any living human, by definition. Would that change something? You can argue it doesn't; I bet it will change everything.

2

vivehelpme t1_jbexk7n wrote

>As you well know, zero-shot learning algorithms beat anything else

It doesn't create a better training set out of nothing.

> this allows them to explore parts of the gaming landscape that were never explored by humans.

Based on generalizing a premade dataset, made by humans.

If an AI could just magically zero-shot a better training set out of nowhere, we wouldn't bother making a training set; we'd just initialize everything to random noise and let the algorithm deus-ex-machina its way to superintelligence out of randomness.

>What is the testable characteristic that would satisfy you enough to declare the existence of an ASI?

Something completely independent is a good start for calling it AGI, and then we can start thinking about whether ASI is a definition that matters.

>For me it is easy: a higher IQ than any living human, by definition. Would that change something? You can argue it doesn't; I bet it will change everything.

So an IQ-test-solving AI is superintelligent despite not being able to tell a truck apart from a house?

2