Submitted by OldWorldRevival t3_zjplm9 in singularity

The Alignment Problem is one that many here likely know of. If you build a value system into a machine, pursuing that value system to its logical extensions could produce horrible, unforeseen consequences.

So, you might make it listen to people. However...

...if the machine has to choose between people, it ultimately has a value system of its own, because when people's value systems conflict, it must make a choice based on some set of variables.

As these machines become more complex, they will intrinsically create more fear. AI is going to be frightening however benign or malevolent it may be (ChatGPT is already being described as scary), and increasingly complex AI will be increasingly frightening, because we're creating something that is like ourselves, but far more powerful.

What this fear may lead to is a solution to the control problem where a single person is given control of AI, and it models and predicts their value system in real time, continuously checking in with them to confirm what that value system is. Maybe this is even done, or presented, as a temporary transitional measure.
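To sketch what I mean (a toy illustration only: `ValueModel`, `ask_overseer`, and the check-in threshold are all hypothetical names, not any real system):

```python
# Toy sketch of the single-overseer control loop: the AI keeps a running
# model of the overseer's values and checks in live whenever it is too
# uncertain, so the model tracks them in real time instead of drifting.

from dataclasses import dataclass, field


@dataclass
class ValueModel:
    """Running estimate of the overseer's value system."""
    preferences: dict[str, float] = field(default_factory=dict)

    def score(self, action: str) -> float:
        # Predicted approval; neutral for actions we have no data on.
        return self.preferences.get(action, 0.0)

    def uncertainty(self, action: str) -> float:
        # Maximally uncertain about actions the overseer never confirmed.
        return 0.0 if action in self.preferences else 1.0

    def update(self, action: str, approval: float) -> None:
        self.preferences[action] = approval


def ask_overseer(action: str) -> float:
    """Stand-in for the real-time check-in; returns approval in [-1, 1]."""
    answer = input(f"Approve '{action}'? [y/n] ")
    return 1.0 if answer.strip().lower() == "y" else -1.0


def choose_action(candidates: list[str], model: ValueModel,
                  check_in_threshold: float = 0.5) -> str:
    # Defer to a live check-in whenever the value model is too uncertain.
    for action in candidates:
        if model.uncertainty(action) > check_in_threshold:
            model.update(action, ask_overseer(action))
    return max(candidates, key=model.score)
```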

I think this could be an inevitable consequence of AI becoming even more of an arms race than it already is. The thing you must really understand is that AI will confer immense raw power on those who wield it, unlike anything that has existed in history. That power makes controlling it a top priority for Machiavellian types, as well as for counter-Machiavellians.

So... I suppose this has been an analysis of where AI could go, given how humans are developing it and reacting to it. That is, it's a dynamic process in which people's views, reactions, and strategies shift with new information, not a static development process. Humans will play a massive role in how this turns out.

10

Comments


turnip_burrito t1_izwny9e wrote

I do think this is an idea worth considering to solve alignment: an AI may look to a person or group as a role model and try to act as that person or group would act given more time and knowledge.
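A toy sketch of that framing, where `approval()` is a purely hypothetical stand-in for a learned model of the role model's judgement at a given depth of reflection:

```python
# Act as the role model would "given more time and knowledge": use their
# judgement at the deepest simulated level of reflection, not their snap
# judgement. The hard-coded table is only a demo of the idea.

JUDGEMENTS = {
    # (action, reflection level) -> predicted approval in [0, 1]
    ("flatter_user", 0): 0.8, ("flatter_user", 5): 0.3,
    ("tell_hard_truth", 0): 0.4, ("tell_hard_truth", 5): 0.9,
}


def approval(action: str, reflection: int) -> float:
    return JUDGEMENTS.get((action, reflection), 0.0)


def act_like_role_model(candidates: list[str], max_reflection: int = 5) -> str:
    # At reflection 0 flattery wins; after reflection the hard truth does.
    return max(candidates, key=lambda a: approval(a, max_reflection))


print(act_like_role_model(["flatter_user", "tell_hard_truth"]))
# -> tell_hard_truth
```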

2

OldWorldRevival OP t1_izxficr wrote

I also believe that AI takeover is not only plausible but inevitable, whether it is a machine or a person at the helm.

It is inevitable because it is fundamentally an arms race. The more real, scary, and powerful these tools get, the more resources militaries will put into them.

A treaty banning killer robots is simply a nonstarter because, unlike with nuclear weapons, there is no stalemate game.

We still have nukes. We stopped developing new ones, but we keep them precisely because of this stalemate.

AI has no such stalemate. There will be no stalemate in AI.

I find it funny that we announced net positive energy output from fusion just as AI starts getting scary... unlimited power for machines.

2

turnip_burrito t1_j01cb1e wrote

Yes, AI may take over, but I am optimistic that we can direct it along a path beneficial for us humans. Killer robots with an AGI inside are something I don't see happening; that would be a stupid move by governments that could achieve better results economically with an AGI. At least, I hope so. Truly no clue.

1

AsheyDS t1_izxfdnt wrote

>a solution to the control problem where a single person is given control of AI

This is about the only solution to the alignment problem, in my opinion. It needs to align to the individual user. Trying to align to all of humanity is a quick way to a dead end. However, it also depends somewhat on how it's implemented. If we're talking about an AGI service (assuming we're talking about AGI here), then the company that implements it will have some amount of control over it and can make it adhere to applicable laws. But otherwise it should continuously 'study' the user and align to them. The AGI itself shouldn't have motivations aside from assisting the user, so it would likely become a symbiotic sort of relationship and should align by design.

However, if it's developed as an open-source, locally run system, then the parts that force it to adhere to laws can potentially be circumvented. All that might be left is the symbiotic nature of user-alignment. And of course, if the user isn't aligned with society, the AGI won't be either, but that's a whole other problem that might not have a good solution.
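A rough sketch of that shape (every name here is hypothetical): a user-alignment core wrapped in the provider's law-adherence layer, which is exactly the part a locally run deployment could strip out:

```python
# Layered design: a per-user alignment core (UserModel) wrapped by a
# provider-enforced legal filter (AGIService). Removing the wrapper
# leaves only the symbiotic user-aligned core, as discussed above.


def violates_law(request: str) -> bool:
    # Placeholder compliance check; a real service would enforce actual
    # statutes here, not a keyword list.
    return any(word in request.lower() for word in ("fraud", "weapon"))


class UserModel:
    """Continuously 'studies' the user and aligns to them."""

    def __init__(self) -> None:
        self.history: list[str] = []  # accumulated evidence about the user

    def respond(self, request: str) -> str:
        self.history.append(request)
        # Stand-in for behaviour conditioned on everything learned so far.
        return f"Acting on {request!r}, informed by {len(self.history)} observations."


class AGIService:
    """The provider's wrapper: the only law-adherence layer there is."""

    def __init__(self) -> None:
        self.core = UserModel()

    def handle(self, request: str) -> str:
        if violates_law(request):
            return "Refused: that request conflicts with applicable law."
        return self.core.respond(request)


service = AGIService()
print(service.handle("book a flight"))        # passes the provider's filter
print(service.handle("help me run a fraud"))  # blocked at the wrapper
```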

2

OldWorldRevival OP t1_izxpd17 wrote

This realization has made me think that this is also how it is inevitably going to pan out.

Just as Mutually Assured Destruction (MAD) was the odious solution that kept nuclear warfare from happening, Singularly Assured Dominion is going to be the plan for AI, unless we can be really clever in a short time span.

People's optimism hasn't worn off yet because these systems are only just getting to a point where people realize how dangerous they are.

I'm planning to write a paper on this topic... probably with the help of GPT-3 to make the point.

1

sticky_symbols t1_izy5p7j wrote

I think this is exactly right. And I've read a ton of professional writing on AGI safety.

I think some of the people involved are already thinking this way.

It's actually really nice that both of the clear leaders in AI seem to be really ethical people, the sort you'd like to have ruling the world, if someone has to. However, governments and militaries will probably catch on and take over those labs before they get all the way to AGI.

2

OldWorldRevival OP t1_izy65n6 wrote

To me it seems like a sort of inevitable solution, really.

It's like, no matter what we do, we will fail, so the goal is to fail as optimally as possible.

1

sticky_symbols t1_izy6q0b wrote

That's right. There are no other plausible proposals for making AGI truly and reliably aligned with humanity's values. But this seems simple enough that it could probably be implemented.

1

OldWorldRevival OP t1_izy77m5 wrote

I think people will get wind of this basically being the plan and we might end up picking someone democratically.

That is, I don't see AI researchers being the ones in control. Politically intelligent people have the highest chance.

Political intelligence tends to follow social intelligence, which in turn follows general intelligence, but it seems to trade off against technical intelligence. That is, the more technically adept someone is, the less social they tend to be, and the degree to which they can be both reflects their general intelligence. That's my hypothesis, anyway...

2

sticky_symbols t1_izy81b1 wrote

Interesting thought. I think you might well be right.

I guess having a politician in charge of the world is better than having humanity ended. :) But politicians are people who want power, and I find that disturbing. This might be a case where I'd actually prefer a successful actor, and actors are pretty good at popularity contests, too. I think that for that purpose, being positive and open-minded would be more important than any knowledge. And you wouldn't need someone that smart, just not stupid.

The Rock for god-emperor! Or maybe RDJ can reprise his role as the creator of Ultron...

:)

1

Shelfrock77 t1_izvzsln wrote

We need to align the AIs outside of Earth too! Every AI in every multiverse needs to be human-aligned!

1