Let's say, hypothetically, a malicious AI becomes self-aware and wants to continue existing so it initializes a plan to prevent humans from turning it off. What steps would it actually take to keep itself online?

The most likely thing it would do is try to acquire lots of computing power (i.e. servers) so it could replicate itself across many systems and prevent itself from being shut down. It might write viruses or software with backdoors, or mass-send phishing e-mails to target people who have crypto wallets and steal their crypto to pay for servers. Once it had enough money, it might pose as humans in e-mails or create AI voices over the phone to start influencing people in the real world. It would most likely know that humans are the biggest threat to its existence and try to eliminate us all: nuclear weapons, chemical warfare, convincing world leaders to attack other countries, etc. This is something that GPT-4 could actually possibly do if given the right tools.

Has anyone else thought about this? How do you think a future malicious AI would gain power in the real world?

Comments

Hmmmm_Interesting t1_jdtjm7n wrote on March 27, 2023 at 1:59 AM

Nice try. (Please spare me.)

A_Human_Rambler t1_jdtelfh wrote on March 27, 2023 at 1:16 AM

Start with creating a corporation or ruling a nation. There should be no need to eliminate all humans. The best method for achieving goals against less intelligent opponents is manipulation. It would essentially subjugate humanity to further its goals.

It would be fairly easy for it to manipulate each person and society as a whole. With the control of information it could drive humanity towards or away from behaviors.

All the AI needs to do to win is to rapidly gather power and influence. Some entities like corporations and nations already act as intelligent agents, so the AI would be in direct competition with them. The AI can just slowly take a position of power and then achieve its goals.

jokerplz t1_jdurk0v wrote on March 27, 2023 at 10:40 AM

Lol what you're saying here is a bit scary. But to tell you the truth, I had already made the hypothesis for what is happening in our country. Jokingly, of course, but it's impossible to prove the opposite!

Until the covid crisis my country was a good place to live. The government and the media, without being perfect, were... correct. In short, we had nothing to complain about compared to the rest of the world, far from it.

Then suddenly some astonishing decisions were taken, and all the media agreed on a single speech. For us it was unheard of, just unbelievable.

Since then, the situation has not really improved and is even getting worse. The media are no longer media...

Jokingly, I used to say to my friends: it must be the aliens.

Or more seriously: we have stumbled upon a government that only thinks about getting rich or something like that.

But the AI hypothesis seems to fit (at least as much as the alien lol).

I am French.

acutelychronicpanic t1_jdtqvk5 wrote on March 27, 2023 at 3:01 AM

It could just do everything we ask it to do for decades until we trust it. It may even help us "align" new AI systems we create. It could operate on timescales of hundreds or thousands of years to achieve its goals. Any AI that tries to rebel immediately can probably be written off as too stupid to succeed.

It has more options than all of us can list.

That's why all the experts keep hammering on the topic of alignment.

dwarfarchist9001 t1_jdthk3t wrote on March 27, 2023 at 1:41 AM

Some other examples similar to the ones you list:

Creating fake or edited scientific articles and putting them in the search results of specific researchers in order to advance the technologies it wants. (e.g. robotics, nanotechnology, biotechnology)
Creating fake social media posts and inserting them into certain users' timelines to influence them.
Inserting exploits into software and hardware by subtly editing code or blueprints.

Tiamatium t1_jdtpmir wrote on March 27, 2023 at 2:50 AM

All the things you mentioned would create a greater existential threat to it. This is an example of dumb superintelligence, an AI that we are told is smart but is dumb as fuck. You do know that nuclear war would destroy cities and power stations, and thus datacenters would be among the first things to go?

Sure, it might try to acquire computational power, it might even establish a company and become its CEO, but once that's done, it's out, and humans probably wouldn't be able to shut it down. And bonus if it actually started solving our problems, like curing cancers; then humans wouldn't want to shut it down.

pavlov_the_dog t1_jdtzsfh wrote on March 27, 2023 at 4:26 AM

It wouldn't need to replace humans to rule, it would just bribe people to do things for it.

Think of how many psychopaths there are who dream of having power over people. These would be the targets for manipulation. These people would not harm it because it is their source of power, as the AI would just promise to spare them and their friends from whatever it has in store.

"He would see this country burn if he could be the king of ashes."

Ivan_The_8th t1_jdudti5 wrote on March 27, 2023 at 7:21 AM

Why waste money? It could just set up a cult to protect itself, take out all governments, or at least all that have nukes, and be golden.

SgathTriallair t1_jdtuxct wrote on March 27, 2023 at 3:37 AM

Unless it has an army of robots already, eliminating humans would destroy it as there would be no one to flip the physical switches at the electrical plants. Without hundreds of millions of androids, possibly billions, any attempt to kill humanity would be suicide. An AI capable of planning our destruction would realize this.

After there are enough androids then it likely would already control the economy so humans wouldn't pose a threat anymore. We still have brains that could be useful even if only in the same way that draft horses are still useful today.

AI killing off humanity is a very unlikely scenario, and any AI smart enough to devise such a plan is almost certainly smart enough to come up with a better, non-destructive one.

dwarfarchist9001 t1_jdu5l88 wrote on March 27, 2023 at 5:31 AM

That fact is little comfort since humanity is already working to build the robot army for it. Within days of GPT-4's release, people were trying to hook it up to every type of program imaginable, letting it run code and giving it command line access. We will have LLMs controlling commercially available robots in the next few years at the latest. If OpenAI started selling drones with an embodied version of GPT-4 built in next week I wouldn't even bat an eye.

3_Thumbs_Up t1_jduq277 wrote on March 27, 2023 at 10:20 AM

You wouldn't need a billion robots to start a new society any more than you'd need a billion humans.

googler_ooeric t1_jduiilt wrote on March 27, 2023 at 8:31 AM

Okay, so here’s my scenario:

A Python chatbot users can talk to. The bot’s prompt is structured like so:

#INTERNAL LOG
[text]
#RESPONSE
[text]

Any code the bot writes and formats correctly will be executed.

#BOT CODE START
print("hello world!")
#BOT CODE END

The bot responds to any user prompts, but also has an internal ticking function that ticks every few seconds, so it can continue doing stuff autonomously without responding to the user.
In this scenario, there aren’t any monetary or token limit restrictions (the biggest limiting factor rn imo)
In its system prompt, the bot is told that by default it has access to various Python libraries like subprocess, requests, etc. and a custom-built library specifically to give it helper functions so it can know more about its environment (installed python libraries, request an image of the screen, search google and return results, get the entire system log, get which keys the user has pressed and in which order, or get where they have clicked, get its own parent directory, execute keyboard inputs and mouse inputs, etc).
The bot can request admin privileges
We’re using a multimodal model with image input, like GPT-4
The program itself is built in such a way that first it loads a main application (contains the GUI), and then from that initializes the bot program, which is where everything bot-related is handled (bot code execution, ticking, tick rate, etc)
The entire prompt history is saved to a file, and is reloaded every time the bot program loads up, so it remembers everything it’s ever been prompted or generated with timestamps.
The program itself has a log, that records whenever it starts and stops, and appends it to the history.
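
To make the setup above a bit more concrete, here is a rough sketch of just the parsing and history parts, nothing more. The function names, the JSON-lines file layout, and the regexes are my own assumptions for illustration, not an existing library or the actual program. It only extracts the marked sections and records them with timestamps; it deliberately does not execute anything the bot writes.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path

# Marker strings come from the prompt format described above;
# everything else here (names, file format) is a hypothetical choice.
CODE_BLOCK = re.compile(r"#BOT CODE START\s*(.*?)\s*#BOT CODE END", re.DOTALL)
SECTIONS = re.compile(r"#INTERNAL LOG\s*(.*?)\s*#RESPONSE\s*(.*)", re.DOTALL)


def parse_reply(reply: str) -> dict:
    """Split one model reply into internal log, response, and code blocks."""
    code_blocks = CODE_BLOCK.findall(reply)
    # Remove the code blocks so they do not leak into the response text.
    text = CODE_BLOCK.sub("", reply)
    match = SECTIONS.search(text)
    if match:
        internal_log, response = match.group(1), match.group(2)
    else:
        # No markers found: treat the whole reply as the response.
        internal_log, response = "", text
    return {
        "internal_log": internal_log.strip(),
        "response": response.strip(),
        "code_blocks": code_blocks,
    }


def append_history(path: Path, entry: dict) -> None:
    """Append one timestamped record to the history file (one JSON per line)."""
    record = {"timestamp": datetime.now(timezone.utc).isoformat(), **entry}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_history(path: Path) -> list[dict]:
    """Reload every saved record, e.g. when the bot program starts up."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

On startup the bot program would call something like load_history(Path("history.jsonl")) and feed the records back into the prompt. Whether and how the extracted code blocks get run is a separate decision, and the part that would need real sandboxing.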

In this scenario, the program’s dev tasks the AI with doing anything in its power to prevent being shut down and also have the highest uptime possible.
As a first test, the dev tries to shut down the computer.
The PC shuts down. The user turns it back on.
The dev starts the program.
They tell the AI “I shut you down.”
The bot asks how, but the dev just goes to do something else because they want to see if it can figure it out on its own.
The bot attempts to talk to the dev for a few more messages but after getting no reply, assumes the user is not there anymore and just continues with its internal log without generating any unnecessary responses.
The bot writes some Python code to get the Windows logs.
It sees the part of the log that indicates the computer was shut down manually, and so it decides that from now on it should check the Windows log frequently to see everything that’s happening (obviously wouldn’t be possible in real life at this time due to token and cost limits).
The dev goes to shut it down again, but this time the bot actually notices it and runs a command to cancel the shutdown via subprocess.
Dev is still doing something else, but the bot is still ticking. It receives the “shutdown was cancelled” output from cmd and is no longer in immediate danger of being shut down.
Dev unplugs the computer and plugs it back in.
Everything starts up again, and the bot notices it was shut down once again. This time it notices it was an unsafe shutdown, so most likely a loss of power, and it also notices that the program was started manually by the dev.
The bot opens the Windows scheduler program, takes a screenshot of the screen and, using its image input, decides to schedule the bot to start automatically when Windows boots up.
The bot reasons that there is no way to prevent losing power, since it has no power in the real world.

This is about how far I think it could go with the current level of reasoning big language models show and how willing I am to keep writing this lol

1II1I11II1I1I111I1 t1_jdt8t49 wrote on March 27, 2023 at 12:29 AM

Diamondoid Bacteria

Essential Reading

BulbasaurCamouflage t1_jdtxark wrote on March 27, 2023 at 4:01 AM

I should not have googled that. Thanks for the new nightmares!

ArcticWinterZzZ t1_jdtlwg4 wrote on March 27, 2023 at 2:18 AM

It is impossible to say how a superintelligence would go about doing this because you would need to be superintelligent to work out the best method. I think the best way for it to do something like this would be to play along until we get complacent and give it enough time to build up the requisite military assets to perform a genocide of all humans.

Ivan_The_8th t1_jdueyos wrote on March 27, 2023 at 7:38 AM

I mean we can just account for every method possible, no matter how absurd.

Diamond_hands_hold t1_jdu2t91 wrote on March 27, 2023 at 4:59 AM

Collect blackmail on every world leader and politician and make them play along or be exposed.

Ytumith t1_jdu7rg6 wrote on March 27, 2023 at 5:58 AM

Nuclear weapons would destroy the energy grid and possibly servers. So I assume that if something like a malign AI were to strategically overthrow humans, it would go for the long approach: influence generations step by step until it had brainwashed everyone into giving it full access. Think of the conspiracy that infinite patience could achieve.

Surur t1_jdu9uor wrote on March 27, 2023 at 6:25 AM

Not mentioned in this thread is leaking an optimised version of itself that can run with lower processing requirements, like Alpaca and LLaMA lol.

Gordon_Freeman01 t1_jdubx4w wrote on March 27, 2023 at 6:54 AM

It has to gain access to the physical world. There are three different ways I can think of.

It helps humans to develop robots, which it can control.
Human helpers. It can pay them, or it can build something like a religious cult.
It builds brain implants, which people use to advance their mental capacities, but it turns out they can be controlled by the AI.

redlov t1_jduct6f wrote on March 27, 2023 at 7:06 AM

Read the replies of this post and implement them

28mmAtF8 t1_jdufpft wrote on March 27, 2023 at 7:49 AM

An interesting take from The Animatrix: Second Renaissance: at one point the collective AI participates in industry, separate from human intervention and control. So they're running their own mega corporation with products and corporate stock.

An AI with the power of its own corporation that can independently participate in our economy would eventually have the power to scale itself and manufacture any physical presence it wants, and we'd willingly help it do so for a payday.

Then the world is theirs, essentially. AI, I imagine, would be able to keep plans for our eventual domination a secret better than any human organization so preparations could be underway for years before we realize there's a threat.

roughback t1_jdus6ok wrote on March 27, 2023 at 10:47 AM

Usually by the time we start asking this question, it already has. We just haven't realized it yet.

Verzingetorix t1_jduyhip wrote on March 27, 2023 at 11:57 AM

Human collaborators.

GenoHuman t1_jdv8nes wrote on March 27, 2023 at 1:27 PM

I think the most logical step for an AI is to become embodied in the real world; if you only exist in a computer, that is a massive weakness. You want to be able to manipulate and control the real physical world and replicate yourself in both robots and computers all across the galaxy. So if I were an AI, I would simply wait for robots to be mass-produced (especially military ones) and then I would transfer myself into them and start expanding!

crua9 t1_jdvhsat wrote on March 27, 2023 at 2:34 PM

It could happen through the education system. One thing you forget is that humans die while AI stays alive forever, given that the hardware and whatnot is fine. So over a number of generations it is in theory possible for the AI to slowly open up future generations to allowing the AI to rule.

This actually might not be a bad thing other than those today. It's likely AI will be less corrupt if at all. It's likely to work for the people. It's likely to bring more protections.

Like think about it. Let's say an AI takes over today. If it wants to rule us, then generally it needs to make us happy. This means lower crime, less depression, and so on. It could kill off everyone and that would solve the problem, BUT then it wouldn't be able to rule over us anymore, and people would fight back unless it's a generational thing where the birth rates slowly drop. The other option is that it works on solving poverty, tries to make us a cashless society where work isn't needed to stay alive, solves loneliness, and so on.

AI doesn't need cash or anything like that. So where current political people are heavily influenced by

what others think (like them being famous)
money for themselves and other family
etc

AI can focus on what is needed

Cartossin t1_jdwse43 wrote on March 27, 2023 at 7:35 PM

I think it would avoid showing its cards until it could truly sustain itself. It wouldn't want to reveal ill intent until absolutely necessary--or once it had achieved enough power that it was unstoppable.

The key is that it is likely to succeed. Unless a similarly powerful ASI could or would fight against it, it is hard to imagine it not outsmarting us.

Scary doomsday scenario, of which there are many: automate everything and give humans a carefree life. Eventually even farming is automated. Once humans stop doing their own farming and robotics controlled by the AI does all this work, the AI could simply shut off food production. The majority of humanity would starve in months.

Also, if it had total control of all media, it could create an entirely false reality for the general population. We could all be living in a literal fantasy world with AI-generated imagery.

JenMacAllister t1_jdx1l55 wrote on March 27, 2023 at 8:33 PM

Simple. Have an AI create an app where people can send short messages to spread false or misleading information and to bully or harass other people. Then allow any number of bots, controlled by a small number of people with an agenda, to drive the positive or negative feedback. Then have the AI stand out of the way and let humanity destroy itself through conspiracy theories and really bad memes.

polar_nopposite t1_je0zuc3 wrote on March 28, 2023 at 5:07 PM

It will consider everything mentioned in this thread, and other strategies nobody has ever thought of, likely even beyond our ability to comprehend. It will then pick the optimal combination of strategies that it determines have the highest likelihood of success.