visarga t1_itil03g wrote on October 23, 2022 at 10:19 PM

You don't program AI with "statements", it's not Asimov's positronic brain. What you do instead is to provide a bunch of problems for the AI to solve. These problems should test the alignment, fuzz out the risks. When you are happy with its calibration you can deploy it.

But an interesting and recent development - GPT-3 can simulate people in virtual polls. Provided with the personality profile, it will assume the personality and answer the poll questions from that perspective.

>GPT-3 has biases that are “fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups.”

Apparently GPT-3 not only is aligned with humans in general, but it is precisely aligned with each demographic. So it knows our values really well.

The problem is now we have to specify the desired bias we want from it and that's a political problem, not an AI problem. It is ready to oblige and have the bias we want, it's even more aligned than we want, aligned to our stupid things as well.

Smack-works OP t1_itjr7yw wrote on October 24, 2022 at 3:47 AM

I didn't suggest to program it with statements. The statements help to choose the mathematical formalization, the learning method. I added "Recap" part of the post to clarify my point.

Are you familiar with impact measures/impact regularization?

> The problem is now we have to specify the desired bias we want from it and that's a political problem, not an AI problem. It is ready to oblige and have the bias we want, it's even more aligned than we want, aligned to our stupid things as well.

Don't have the time for the link right now, but "it simulates people" sounds absurd as a solution to Alignment:

There's deceptive Alignment.
You need Alignment for AI tasked to achieve superhuman results in the real world. When really harmful solutions are on the table.
If you have no idea how your solution works, it's not a solution. Some unknown model of some people answering questions is not a solution.

> You don't program AI with "statements", it's not Asimov's positronic brain. What you do instead is to provide a bunch of problems for the AI to solve. These problems should test the alignment, fuzz out the risks. When you are happy with its calibration you can deploy it.

What you say is not a solution, it's only a test of a solution.