maskedpaki t1_j3cxzob wrote

Don't you think they would test it on things outside the training data when running these tests, to avoid misleading people?

From what I've heard, Anthropic has high ethical standards and is primarily focused on AI safety?

5

overlordpotatoe t1_j3dlot5 wrote

You would think so, but if this AI is trained like other AIs, where they dump a massive amount of text data into it without necessarily having curated it closely, it would be difficult to know that this common riddle wasn't in there somewhere.

5

Homicidal_Duck t1_j3g6ag9 wrote

The point isn't that you'd specifically remove this riddle, or bank on its nonexistence, but more that you'd feed it a riddle that's similar in premise while using little of the same language.

2