RamaSchneider OP t1_j7atv8q wrote

That bit about the reward - that is going to stick with me. If I were a self-aware computer, what would I view as a reward?

7

MoreLikeZelDUH t1_j7btvah wrote

These programs all exist within the confines of what they're programmed to do. No matter how advanced the AI here gets, it's not going to be able to redefine its guidelines on what it's allowed to talk about. Similarly, the reward system is arbitrary and only important because the AI is programmed to value it. In other words, you could just implement a value rating and tell the AI that it's more desirable to have a higher score. The AI's "reward" is to get more points, and the AI values that because that's how it was programmed. It can't "decide" to change that, because that's not what it's allowed to do.
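
A rough sketch of the "value rating" idea, in case it helps picture it. Everything here is made up for illustration: `programmed_reward`, its scoring rules, and the candidate replies are hypothetical and not taken from any real system. The point is only that the "AI" picks whatever scores highest under a rule someone else wrote, and nothing in its own code lets it rewrite that rule.

```python
from typing import Callable, Sequence


def programmed_reward(reply: str) -> int:
    """Hypothetical scoring rule, chosen by the programmer and not by the agent."""
    score = 0
    if "please" in reply.lower():
        score += 2  # politeness earns points only because the programmer said so
    if len(reply) <= 40:
        score += 1  # short replies earn a point for the same arbitrary reason
    return score


def choose(candidates: Sequence[str], reward: Callable[[str], int]) -> str:
    """Return the candidate with the highest score under the given reward."""
    return max(candidates, key=reward)


candidates = [
    "Go away.",
    "Please try again later.",
    "Could you please rephrase the question in much more detail for me?",
]

best = choose(candidates, programmed_reward)
print(best)
```

Swap in a different scoring function and the same `choose` code will "want" something completely different, which is what I mean by the reward being arbitrary.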

7

rogert2 t1_j7cow3v wrote

Look up "reward hacking." This is a well-studied problem, and it exists outside of AI. Rob Miles is an AI researcher who has done a few videos talking about reward hacking.

3

RamaSchneider OP t1_j7ey8im wrote

Thanks, never heard the phrase before - I've got some reading to do. NNTR

1