MetaAI_Official (OP) wrote:

Our final agent does not explicitly try to detect deception. We do have models that predict the actions players will take based on the board state and message history, and these models may implicitly detect betrayal by predicting actions that don't match what was said in the message history. CICERO does have a model that tries to detect whether its *own* messages don't correspond to its intended action, and it will filter out the most egregious cases of that. -JG
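To make the self-consistency filter described above concrete, here is a toy sketch. It is not CICERO's code: `filter_messages`, `toy_predictor`, the action labels, and the `min_prob` threshold are all hypothetical names made up for illustration, and the keyword lookup stands in for what would really be a learned model. The idea is only that a candidate message gets dropped when a predictor assigns very low probability to the intended action given that message.

```python
from typing import Callable, Dict, List


def toy_predictor(message: str) -> Dict[str, float]:
    """Hypothetical stand-in for a learned model that maps a message
    to a probability distribution over the sender's next action."""
    text = message.lower()
    if "support" in text:
        return {"support": 0.9, "attack": 0.1}
    if "attack" in text:
        return {"attack": 0.9, "support": 0.1}
    return {"support": 0.5, "attack": 0.5}


def filter_messages(
    candidates: List[str],
    intended_action: str,
    predict_action: Callable[[str], Dict[str, float]],
    min_prob: float = 0.2,
) -> List[str]:
    """Keep only candidates that do not strongly contradict the intended action.

    A low threshold means only the most egregious mismatches are removed.
    """
    kept = []
    for message in candidates:
        # Probability the predictor assigns to the action we actually plan to take.
        prob = predict_action(message).get(intended_action, 0.0)
        if prob >= min_prob:
            kept.append(message)
    return kept


candidates = [
    "I will support you into Munich.",
    "I plan to attack Munich.",
    "Let's talk next turn.",
]
kept = filter_messages(candidates, "attack", toy_predictor)
print(kept)
```

With the intended action set to `"attack"`, the first candidate is dropped because it implies the opposite action, while the neutral message is kept. That matches the "most egregious cases" wording: the filter only removes messages that clearly contradict the plan.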
