icosaplex t1_iux2lf3 wrote

For reference, self-play typically uses 1500 visits per move right now, rather than 600. (That is, on the self-play positions that are recorded for training; the moves in between, which only advance the game trajectory, use fewer.)
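A minimal sketch of that schedule, assuming hypothetical `search` and `game` interfaces (none of this is KataGo's actual API, and the recording probability is made up for illustration):

```python
import random

# Hypothetical sketch: positions recorded as training examples get the full
# search budget, while the in-between moves that merely advance the game use
# a cheaper search. `search` and `game` are assumed interfaces.
RECORDED_VISITS = 1500  # full budget for positions written to training data
FAST_VISITS = 600       # cheaper budget for moves that only advance the game

def self_play_game(game, net, record_prob=0.25):  # record_prob is illustrative
    examples = []
    while not game.is_over():
        record = random.random() < record_prob
        visits = RECORDED_VISITS if record else FAST_VISITS
        policy_target, move = search(game, net, visits=visits)  # MCTS search
        if record:
            examples.append((game.encode(), policy_target))
        game.play(move)
    return examples, game.result()
```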

I would not be so surprised if you could scale up the attack to work at that point. It would be interesting. :)

In actual competitions and matches, i.e. full-scale deployment, the number of visits per move is typically in the high millions or tens of millions. This is part of why the neural net in AlphaZero-style board game agents is so tiny compared to models in other domains (e.g. parameter counts measured in millions rather than billions): you want the net to be fast enough to query a huge number of times at inference.
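Rough back-of-envelope arithmetic for that tradeoff (all numbers illustrative; one forward pass is crudely approximated as ~2 FLOPs per parameter):

```python
visits_per_move = 10_000_000  # "tens of millions" of visits at match time

def flops_per_move(params, visits=visits_per_move):
    # Crude approximation: one forward pass costs ~2 FLOPs per parameter.
    return visits * 2 * params

print(f"50M-param net: ~{flops_per_move(50e6):.0e} FLOPs per move")  # ~1e+15
print(f"50B-param net: ~{flops_per_move(50e9):.0e} FLOPs per move")  # ~1e+18
```

Three orders of magnitude per move is the whole story: the small net's budget fits in seconds on a few GPUs, the large one's does not.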

I'm also very curious how much the attack relies on the kind of adversarial exploitation that, like image misclassification attacks, is almost impossible to fix, versus the neural net simply being undertrained on these kinds of positions in a way that more training can easily patch.

For example, suppose the neural net were trained more on these kinds of positions, both to predict not to pass initially and to predict that the opponent will pass in response, and then frozen. Does it gain only narrow protection and remain just as vulnerable, needing only a slightly updated adversary? Or does it become broadly robust to the attack? I think answering that would be highly informative for understanding the phenomenon, just as much if not more so than simply scaling up the attack.
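In experiment-sketch form, assuming hypothetical helpers `fine_tune`, `freeze`, `train_adversary`, and `evaluate` (none of these are from the paper's codebase; they just name the steps of the protocol):

```python
def robustness_probe(victim_net, attack_positions):
    # 1. Patch: train the victim on the positions the attack exposed, with
    #    targets of "don't pass here" and "the opponent will pass in response".
    patched = fine_tune(victim_net, attack_positions,
                        targets=("do_not_pass", "opponent_will_pass"))
    freeze(patched)  # no further learning during the re-attack

    # 2. Re-attack: train a fresh adversary against the frozen, patched victim.
    new_adversary = train_adversary(victim=patched)

    # 3. Interpret: a high win rate for the new adversary suggests only narrow
    #    protection (image-attack-style brittleness); a low one suggests the
    #    original hole was mostly undertraining.
    return evaluate(new_adversary, patched)
```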
