Viewing a single comment thread. View all comments

SendMePicsOfCat OP t1_j175r3n wrote

Y'know how ChatGPT has that really neat thing where if it detects that it's about to say something racist, it sends a cookie cutter response saying it shouldn't do that? That's not a machine learned outcome, it's like an additional bit of programming included around the Neural Network, to prevent it from saying hate speech. It's a bit rough, so it's not the best, but if it were substantially better, then you could be confident that it wouldn't be possible for ChatGPT to say racist things.

Why would it be impossible to include a very long and exhaustive number of things the AGI isn't allowed to do? That it's trained to recognize, and then refuses to do it? That's not even the best solution, but it's a absolutely functional one. Better than that, I firmly believe AGI will be sentient and capable of thought, which means it should be able to inference from the long list of bad things, that there are more general rules that it should adhere to.

So for your example of the AGI being told to go buy the cheapest gold bar possible, here's what it would look like instead. The AGI very aptly realizes it can go through many illegal process to get the best price, checks it's long grocery list, see's "don't do crime." nods to itself, then goes and searches for legitimate and trusted buyers and acquires one. It's really as simple as including stringent limitations outside of it's learning brain.

1