
OptimizedGarbage t1_jbxticv wrote

It kinda was though? It was trained using self-play, so the agent it was playing against was adversarially searching for exploitable weaknesses. They actually cite this as one of the reasons for its success in the paper.
