
currentscurrents t1_jbwfbdd wrote

TL;DR they trained an adversarial attack against AlphaGo. They used an optimizer to find scenarios where the network performed poorly. Then a human was able to replicate these scenarios in a real game against the AI.
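Roughly what "used an optimizer to find scenarios where the network performs poorly" can look like in the simplest possible form — a toy black-box random-search sketch, nothing like the paper's actual setup; `victim_value`, `reference_value`, and `perturb` are made-up stand-ins:

```python
# Toy sketch only: random hill-climbing that searches for positions where a
# frozen "victim" evaluation disagrees most with a reference evaluation.
import numpy as np

rng = np.random.default_rng(0)
BOARD = 9  # toy board size

def victim_value(state):
    # stand-in for the frozen network's evaluation of a position;
    # this toy victim overvalues stones on the first and last rows
    return float(np.abs(state[0]).sum() + np.abs(state[-1]).sum())

def reference_value(state):
    # stand-in for ground truth (e.g. the outcome of a much deeper search)
    return float(np.abs(state).sum() * 0.2)

def perturb(state):
    # propose a nearby position by changing one point
    s = state.copy()
    i, j = rng.integers(0, BOARD, size=2)
    s[i, j] = rng.choice([-1, 0, 1])
    return s

state = np.zeros((BOARD, BOARD), dtype=int)
worst_gap = 0.0
for _ in range(5000):
    cand = perturb(state)
    gap = abs(victim_value(cand) - reference_value(cand))
    if gap > worst_gap:  # keep the position where the victim is most wrong
        state, worst_gap = cand, gap

print(f"largest victim error found: {worst_gap:.1f}")
```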

The headline is kinda BS imo; it's a stretch to say it was beaten by a human, since the human was just following the instructions from the optimizer. But adversarial attacks are a serious threat to deploying neural networks for anything important; we really do need to find a way to beat them.

72

serge_cell t1_jbwt0s9 wrote

It's a question of training. AlphaGo was not trained against adversarial attacks. If it had been, this whole family of attacks wouldn't work, and training new adversarial attacks would be an order of magnitude more difficult. It's shield and sword again.

6

Excellent_Dirt_7504 t1_jbwwi8v wrote

If you train against one attack, you remain vulnerable to another. There is no evidence of a defense that is robust to any adversarial attack.

7

suflaj t1_jbx9h57 wrote

But there is evidence of a defense: take as many adversarial attacks as possible and train against them. Ultimately, the real defense is generalization. We know it exists and we know it is achievable; we just don't know HOW it's achievable (for non-trivial problems).
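A minimal sketch of what "train against as many adversarial attacks as possible" means in practice — toy logistic regression with an FGSM-style perturbation folded into every update; the data and names are made up, this is just the shape of adversarial training:

```python
# Toy adversarial training loop: at each step, generate adversarial inputs
# against the current model and include them in the update.
import numpy as np

rng = np.random.default_rng(0)

# toy binary classification data
X = rng.normal(size=(512, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr, eps = 0.1, 0.3  # learning rate, attack strength

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(200):
    # FGSM-style attack: perturb inputs in the direction that increases the loss
    p = sigmoid(X @ w + b)
    grad_x = np.outer(p - y, w)          # d(loss)/d(input) for logistic loss
    X_adv = X + eps * np.sign(grad_x)

    # train on clean + adversarial examples
    X_all = np.vstack([X, X_adv])
    y_all = np.concatenate([y, y])
    p_all = sigmoid(X_all @ w + b)
    w -= lr * (X_all.T @ (p_all - y_all)) / len(y_all)
    b -= lr * float(np.mean(p_all - y_all))

acc_clean = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(f"clean accuracy after adversarial training: {acc_clean:.2f}")
```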

6

OptimizedGarbage t1_jbxticv wrote

It kinda was though? It was trained using self-play, so the agent it was playing against was adversarially searching for exploitable weaknesses. They actually cite this as one of the reasons for its success in the paper.
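Toy illustration of that point (fictitious self-play on rock-paper-scissors, obviously nothing like AlphaGo's actual pipeline): the "opponent" is just the agent itself computing a best response, i.e. actively hunting for the current policy's weaknesses:

```python
# Toy fictitious self-play: the opponent is a copy of the agent that
# best-responds to (exploits) the agent's current policy.
import numpy as np

# payoff[i, j] = reward for playing move i against move j (Rock, Paper, Scissors)
payoff = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]], dtype=float)

policy = np.array([1.0, 0.0, 0.0])  # start by always playing Rock

for t in range(1, 1001):
    # the "opponent" finds the best response to the current policy,
    # i.e. it adversarially searches for the policy's weaknesses
    exploit = payoff @ policy
    best_response = np.zeros(3)
    best_response[np.argmax(exploit)] = 1.0

    # the agent then folds that exploit back into its own policy
    policy = policy + (best_response - policy) / (t + 1)

print("policy after self-play:", np.round(policy, 2))  # approaches 1/3 each
```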

1

ertgbnm t1_jbyocgi wrote

Isn't AlphaGo trained against itself? So I would consider that adversarial training.

1

serge_cell t1_jc1to7o wrote

There was a paper about this. The finding was a specific set of positions that were not encountered, or were poorly represented, during self-play. Fully trained AlphaGo failed on those positions. However, once they were explicitly added to the training set the problem was fixed and AlphaGo was able to play them well. This adversarial training seems like just an automatic way of finding those positions.
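Roughly the shape of that fix, as a toy sketch (a 1-nearest-neighbour lookup standing in for the value net, made-up data): the failing positions come from a region the self-play data never covered, and simply appending them to the training set removes the error there:

```python
# Toy sketch: positions from a region self-play never reached are mis-evaluated
# until they are explicitly added to the training set.
import numpy as np

rng = np.random.default_rng(0)

def true_value(X):
    # hypothetical ground-truth evaluation of a "position"
    return np.sin(X[:, 0]) + X[:, 1] ** 2

def predict_1nn(X_train, y_train, X):
    # toy stand-in for the value net: 1-nearest-neighbour lookup
    d = np.linalg.norm(X[:, None, :] - X_train[None, :, :], axis=-1)
    return y_train[d.argmin(axis=1)]

# self-play only explores one region of position space
X_selfplay = rng.uniform(0, 1, size=(500, 2))
y_selfplay = true_value(X_selfplay)

# the adversarial search digs up positions from a region self-play never reached
X_found = rng.uniform(3, 4, size=(20, 2))
y_found = true_value(X_found)

# fresh positions from that same neglected region, used only for evaluation
X_test = rng.uniform(3, 4, size=(200, 2))
y_test = true_value(X_test)

err_before = np.mean((predict_1nn(X_selfplay, y_selfplay, X_test) - y_test) ** 2)

# the fix: explicitly add the found positions to the training set
X_aug = np.vstack([X_selfplay, X_found])
y_aug = np.concatenate([y_selfplay, y_found])
err_after = np.mean((predict_1nn(X_aug, y_aug, X_test) - y_test) ** 2)

print(f"error on neglected region: before={err_before:.2f} after={err_after:.2f}")
```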

PS: the fitness landscape is not convex; it is separated by hills and valleys. Self-play may have a problem reaching all the important states.

1