sdmat t1_iyc8ab6 wrote

> The problem of producing task specifications does not get worse with AI intelligence (because as we've already seen, the difficulty of producing a specification is independent) which is fundamentally inconsistent with the LessWrongist viewpoint.

I think the LW viewpoint is that for the correctness of a task specification to be genuinely independent of the AI, it must include preferences that cover the effects of all possible ways of executing the task.

The claim is that with our present AIs we don't need to be anywhere near this specific, but only because they can't do very much - we can accurately predict the general range of possible actions and the kinds of side effects they might cause in executing the task, so we only need to worry about whether we get useful results.

Your view is that this is refuted by the existence of approaches that generate a task specification and check execution against the specification. I don't see how that follows - the LW concern is precisely that this kind of ad-hoc understanding of what we actually mean by the original request is only safe for today's less capable systems.
