serge_cell t1_jc1to7o wrote

There was a paper about this. The finding was a specific set of positions not encountered, or poorly represented, during self-play. Fully trained AlphaGo was failing on those positions. However, once they were explicitly added to the training set, the problem was fixed and AlphaGo was able to play them well. This adversarial training seems to be just an automatic way of finding those positions.
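Roughly, the loop looks something like the sketch below. This is a minimal, purely hypothetical illustration of the idea, not anything from the paper or the AlphaGo codebase; all the function names (`self_play_games`, `adversarial_search`, `label_with_search`, `train_step`) are placeholders:

```python
def adversarial_training(agent, adversary, n_rounds, games_per_round):
    """Augment self-play data with positions an adversary finds the agent
    handles badly, then retrain on the combined set. Purely illustrative."""
    dataset = []
    for _ in range(n_rounds):
        # Ordinary self-play data generation.
        dataset.extend(self_play_games(agent, games_per_round))

        # The adversary's only job: surface states that self-play rarely
        # (or never) visits and where the agent blunders.
        weak_positions = adversarial_search(agent, adversary)

        # Label those positions (e.g. with deeper search) and fold them
        # explicitly back into the training set, as in the paper above.
        dataset.extend(label_with_search(weak_positions))

        agent = train_step(agent, dataset)
    return agent
```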

PS: The fitness landscape is not convex; it is separated by hills and valleys. Self-play may have a problem reaching all important states.