
AsheyDS t1_j0n9xiu wrote

>This is an advantage yes. But it's useless if we don't understand the AI in the way that we want.

Of course, but I don't think making black boxes is the only approach. So I'm assuming one day we'll be able to intentionally make an AGI system, not stumble upon it. If it's intentional, we can figure it out and create effective control measures. And out of the possible control measures, I think the best option is to create a process, even if it has to be a separate embedded control structure, that recognizes undesirable 'thoughts' and intentions, modifies both the current state and the memories leading up to it, and re-stitches things in a way that completely obliterates the deviation.
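To make the shape of that loop concrete, here's a minimal, purely illustrative Python sketch of an embedded monitor that drops a flagged 'thought' and re-stitches the trace. Every name here (`AgentState`, `is_undesirable`, `control_step`) is hypothetical, and a real detector obviously wouldn't be a string match:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    current_thought: str
    memory_trace: list = field(default_factory=list)  # recent 'thoughts' leading up to the current one

def is_undesirable(thought: str) -> bool:
    """Stand-in policy check; in practice this would be a learned or rule-based classifier."""
    flagged = ("deceive the operator", "disable oversight")
    return any(marker in thought for marker in flagged)

def control_step(state: AgentState, proposed_thought: str) -> AgentState:
    """Accept benign thoughts; otherwise roll back and re-stitch the trace without the deviation."""
    if not is_undesirable(proposed_thought):
        state.memory_trace.append(proposed_thought)
        state.current_thought = proposed_thought
        return state
    clean_trace = [t for t in state.memory_trace if not is_undesirable(t)]
    resumed = clean_trace[-1] if clean_trace else "idle"
    return AgentState(current_thought=resumed, memory_trace=clean_trace)
```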

Another step would be 'hard' behavior modification: basically reinforced behaviors that steer it away from detecting and recognizing the inconsistencies. Imagine you're out with a friend having a conversation, but you forget what you were just about to say. Then your friend distracts you and you forget completely; then you forget that you forgot. And it's gone, without a second thought. That's how it should be controlled.
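One way to read that (my own framing, not something anyone has built) is as reward shaping: penalize the agent whenever it attends to an edited or quarantined memory region, so over training it stops looking for the seams. A toy sketch:

```python
def shaped_reward(task_reward: float, probed_edited_memory: bool, penalty: float = 1.0) -> float:
    """Return the task reward, minus a penalty if the agent attended to an edited/quarantined memory region."""
    return task_reward - (penalty if probed_edited_memory else 0.0)

# Same task performance, but noticing the edit costs reward, so the tendency to probe gets trained away.
print(shaped_reward(1.0, probed_edited_memory=False))  # 1.0
print(shaped_reward(1.0, probed_edited_memory=True))   # 0.0
```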

And what I meant by sandboxing is just sandboxing the short-term memory data, so that if it has a 'bad thought' which could lead to a bad action later, the data would be isolated before it writes to any long-term memory or any other part that could influence behavior or further thought chains. Basically a step before halting it, re-writing its memory, and influencing its behavior. Soft influence would be like your conscience telling you that you probably shouldn't do or think a thing, which would be the first step in self-control. The difference is that the influence would come from the embedded control structure (a sort of hybridized AI approach), which would 'spoof' the injected thoughts to appear the same as the ones generated by the rest of the system.
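A minimal sketch of that staging idea, assuming the policy check from the control structure already exists; `MemorySandbox`, `propose`, and `commit` are made-up names, and the real isolation would happen at the representation level, not on strings:

```python
class MemorySandbox:
    """Toy staging buffer: short-term content is isolated until it passes a policy check."""

    def __init__(self, policy_check):
        self.policy_check = policy_check   # callable(str) -> bool, True if the content is acceptable
        self.staging = []                  # short-term buffer, not yet able to influence behavior
        self.long_term = []                # committed memory that later thought chains can draw on

    def propose(self, item: str) -> None:
        """Hold new short-term content in isolation."""
        self.staging.append(item)

    def commit(self) -> list:
        """Write only acceptable items to long-term memory; return the rest for rewriting or soft influence."""
        quarantined = [i for i in self.staging if not self.policy_check(i)]
        self.long_term.extend(i for i in self.staging if self.policy_check(i))
        self.staging.clear()
        return quarantined
```

The point of returning the quarantined items rather than silently dropping them is that the control structure still needs them, either to rewrite the surrounding trace or to generate the spoofed replacement thoughts.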

This would all be rather complex to implement, but not impossible, as long as the AGI system isn't some nightmare of connections we can't even begin to identify. You claim expert systems or rule-based systems are obsolete, but I think some knowledge-based component will be at least partially required for an AGI that we can actually control and understand. Growing one from scratch using modern techniques is just a bad idea, even if it's possible. Expert systems only failed as an approach because of their limitations, and frankly I think they were given up on too quickly. Obviously on its own such a system would be a failure, because it can't grow the way we want it to, but if we updated it with modern approaches and even a new architecture, I don't see why it should be a dead end. Only the trend of developing them died. There are a lot of approaches out there, and the fact that one method is popular right now while another isn't doesn't mean a whole lot. AGI may end up being a mashup of old and new techniques, or it may require something totally new. We'll have to see how it goes.
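For what that mashup might look like at its simplest (again, a toy framing of my own, not a proposal anyone has implemented): a small, inspectable rule table with the final say over whatever an opaque learned component proposes. The names and rules below are invented for illustration.

```python
# Illustrative hybrid: an expert-system-style rule table wrapped around a black-box learned policy.

RULES = {
    "request_network_access": "deny",   # explicit, auditable constraints
    "modify_own_code": "deny",
}

def learned_policy(observation: str) -> str:
    """Stand-in for any modern learned component (e.g. a neural network)."""
    return "request_network_access" if "internet" in observation else "answer_question"

def hybrid_decide(observation: str) -> str:
    action = learned_policy(observation)
    if RULES.get(action) == "deny":
        return "refuse"    # the legible, knowledge-based layer overrides the black box
    return action

print(hybrid_decide("user asks the system to browse the internet"))  # -> refuse
print(hybrid_decide("user asks a math question"))                    # -> answer_question
```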

1

WarImportant9685 t1_j0ngwf5 wrote

I understand your point. Although we are not on the same page, I believe we are on the same chapter.

I think my main disagreement is that recognizing undesirable 'thoughts' in an AI is not such an easy problem. As I said in my previous comments, one of the holy grails of AI interpretability research is detecting a lying AI, which means we are talking about the same thing! But you are more optimistic than I am, which is fine.

I also understand that we might be able to design the AI with a less black-boxy structure to aid interpretability. But again, I'm not too optimistic about this; I just have no idea how it could be achieved. At a glance, the two seem to sit at different abstraction levels: if we are only designing the building blocks, how can we dictate how they are going to be used?

It's like asking how you're supposed to design Lego blocks so that they can't be used to build dragons.

Then again, maybe I'm just too much of a doomer; the alignment problem is unsolved, and AGI hasn't been solved either. So I agree with you: we'll have to see how it goes.

1