
ditlevrisdahl t1_izd4qrp wrote

What techniques did you use to evaluate that your model was actually learning the game?

I can imagine that for the first million episodes the model just produced rambling. So did you just cross your fingers and hope for some results later? Or did you see a steady increase in performance?