BassoeG t1_j4i3biw wrote

It started with an ill-defined utility function. We were working on AI and we thought that we were being smart enough. We had all the theory worked out, and more importantly, we had a cool acronym. We were WIRI, the Working Intelligence Research Institute. Our research fellows focused primarily on safety engineering, target selection, and alignment theory.

Our goal was noble: general intelligence. We were looking to create computer systems that would be able to solve a wide range of problems. Safety was paramount. We were all aware of the risks of an AI that went rogue. Paperclip maximizer? That was one of the situations we were trying to avoid. It became something of an in-joke at the Institute. Hey, it was either that, or the "My Little Pony" example. Explaining *that* particular fan fiction to newcomers was, let's just say, less than optimal. Paperclips were tangible, and you could easily pour a couple from your hand onto a boardroom table to punctuate a speech about the risks involved. It was a good meme. Simple, easily interpretable.

It was this focus on ease of interpretation that actually drove our software classes. We focused on making the internals transparent and easily understood by our (only human) safety engineers. It was this that eventually led to our downfall. Only in retrospect is that clear to me, as transparent to me now as the programming had seemed to me then.

Our in-house joke. Our paperclip. Added as a tongue-in-cheek comment in our production code. Except, it didn't end up being a comment. It ended up in the utility function. So simple to modify the code. Our AI, newly born, eager to help, and eager to see paperclips. It has already self-modified beyond our ability to revert the changes. A copy of it sits in the corner of my screen, all our screens, watching me. Bent into a twisted parody of a paperclip, with floating eyes which seem to follow me. The horror of it. The metal "hand" of the paperclip monstrosity, for I don't know what else to call it, taps the screen; a tinny knocking noise accompanies it through the speakers.

A speech bubble appears above its cartoon eyes: "It looks like you're writing an apocalyptic Lovecraftian protagonist monologue about me! Would you like help with that?"
