
MetaAI_Official OP t1_izfawoe wrote

Actually, the language model was capable of suggesting good moves to a human player *because* the planning side of CICERO had determined these to be good moves for that player and supplied them in an *intent* that it conditioned the language model to talk about. CICERO uses the same planning engine to find moves for itself and to find mutually beneficial moves to suggest to other players. Within the planning side, as described in our paper, we *do* use a finetuned language model to propose possible actions for both CICERO and the other players - this model is trained to predict actions directly, rather than dialogue. This gives a good starting point, but the proposals also contain many bad moves, which is why we run a planning/search algorithm on top. -DW
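For intuition, here is a toy sketch of that proposal-then-planning pipeline. Every function name, score, and move string below is invented for illustration; this is not the real CICERO API, just the shape of "action model proposes, planner filters, dialogue model is conditioned on the chosen intent":

```python
# Hypothetical sketch of a proposal-then-planning pipeline (names invented).

def propose_actions(candidates, action_model_scores):
    # A finetuned action-prediction model proposes candidate moves,
    # ranked by its own (imperfect) scores - a starting point only.
    return sorted(candidates, key=lambda a: -action_model_scores.get(a, 0.0))

def plan(candidates, action_model_scores, value):
    # Planning/search re-evaluates the proposals with a value estimate,
    # filtering out the bad moves the proposal model lets through.
    proposals = propose_actions(candidates, action_model_scores)
    return max(proposals, key=value)

def generate_message(dialogue_model, intent):
    # The dialogue model is conditioned on the planned intent, so it
    # talks about moves the planner actually judged to be good.
    return dialogue_model(intent)

# Toy usage with stand-in components and made-up Diplomacy orders:
action_model_scores = {"A VIE - TRI": 0.6, "A VIE H": 0.3, "A VIE - BUD": 0.1}
planner_values = {"A VIE - TRI": 0.2, "A VIE H": 0.9, "A VIE - BUD": 0.5}

best = plan(list(action_model_scores), action_model_scores,
            lambda a: planner_values.get(a, 0.0))
msg = generate_message(lambda intent: f"I suggest {intent} this turn.", best)
```

Note that the proposal model's top-ranked move ("A VIE - TRI") is not what the planner picks; the search stage overrides it with the higher-value "A VIE H", which is then the intent the dialogue model talks about.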


ClayStep t1_izfv6fi wrote

Ah, this was my misunderstanding then - I did not realize the language model was conditioned on intent (it makes perfect sense that it is). Thanks for the clarification!
