[deleted] t1_izbjbxe wrote on December 7, 2022 at 10:09 PM

What is the motivation for developing ‘Human-like play’? It doesn’t seem obvious to me how imperceptibility is useful in the wider applications of your methods.

Centurion902 t1_izd6r4m wrote on December 8, 2022 at 5:59 AM

Inhuman play would be flagged as untrustworthy and make it difficult for the AI to form alliances in-game, thus leading to weaker play overall.

[deleted] t1_izdjyj3 wrote on December 8, 2022 at 8:51 AM

Thanks, this answered my question. I guess the point is to be imperceptible to other humans, not necessarily to an algorithm, which was my confusion. It also makes this result more impressive.

As you said, if other humans detect that a player is an AI bot, the general lack of trust people have towards AI may diminish the bot's ability to form alliances.

This work would help towards building human-like agents, and there are lots of motivations for developing those.

MetaAI_Official OP t1_izfe82v wrote on December 8, 2022 at 6:32 PM

The title of the paper doesn't refer to CICERO being "human-like" necessarily (though it does behave in a fairly human-like way). Instead it refers to the agent achieving a score that's on the level of strong human players.

But also, CICERO is not just trying to be human-like: it’s also trying to model how *other* humans are likely to behave, which is necessary for cooperating with them. In one of our earlier papers we show that even in a dialogue-free version of Diplomacy, an AI that’s trained purely with RL without accounting for human behavior fares quite poorly when playing with humans (Paper). The wider applications we see for this work are all about building smart agents that can cooperate with humans (self-driving cars, AI assistants, …) and for all these systems it’s important to understand how people think and match their expectations (which often means responding in a human-like way ourselves, though not necessarily).

When language is involved, understanding human conventions is even more important. For example, saying “Want to support me into HOL from BEL? Then I’ll be able to help you into PIC in the fall” is likely more effective than the message “Support BEL-HOL” even if both express the same intent. -AL