MetaAI_Official OP wrote

The learned features are specific to the game of Diplomacy because the data we used is specific to the game of Diplomacy, but the ideas can be transferred to other domains. Rather than just learning Diplomacy by playing against itself, the AI used a model trained on human games both to guide exploration during training (sampling moves from this model during self-play) and to inform planning (considering what actions humans are likely to take). It's not always obvious exactly how to apply this, but we think there are exciting opportunities for research in this space! -AM
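
To make the two uses of a human-trained model a bit more concrete, here is a minimal toy sketch. It is not the actual system: the action names, probabilities, value estimates, and the `lam` parameter are all hypothetical, and weighting the human prior by exponentiated value estimates is just one illustrative way to keep planning close to human-like play, assumed here for the example.

```python
import math
import random


def human_regularized_policy(human_prior, q_values, lam):
    """Blend a human-imitation prior with value estimates (illustrative only).

    policy(a) is proportional to human_prior[a] * exp(q_values[a] / lam).
    A large lam stays close to the human prior; a small lam favors high value.
    """
    # Subtract the max value before exponentiating for numerical stability.
    q_max = max(q_values[a] for a in human_prior)
    weights = {
        a: p * math.exp((q_values[a] - q_max) / lam)
        for a, p in human_prior.items()
    }
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def sample_action(policy, rng):
    """Draw one action from a dict mapping action -> probability."""
    actions = list(policy)
    probs = [policy[a] for a in actions]
    return rng.choices(actions, weights=probs, k=1)[0]


# Hypothetical stand-ins for a model trained on human games and for
# value estimates learned through self-play.
human_prior = {"hold": 0.6, "support": 0.3, "attack": 0.1}
q_values = {"hold": 0.0, "support": 0.5, "attack": 1.0}

rng = random.Random(0)

# Use 1: guide exploration in self-play by sampling moves from the human model.
exploration_action = sample_action(human_prior, rng)

# Use 2: during planning, account for what humans are likely to do by
# keeping the planned policy anchored to the human prior.
planning_policy = human_regularized_policy(human_prior, q_values, lam=1.0)
```

In this sketch, `lam` controls the trade-off: very large values reproduce the human prior almost exactly, while very small values put nearly all probability on the highest-value action.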