
Lairv t1_ir7qmlc wrote

Cool paper, but worth noting that such systems require huge resources to train. The authors only mention it briefly in the appendix: "1,600 TPUv4 actors to play games, and 64 TPUv3s to train the networks, for a week". For reference, AlphaZero for Go was trained with 5,000 TPUv1 actors generating games and 64 TPUv2s training the networks, over 8 hours. I still find it unfortunate that so little work has been done on reducing the resources needed to train AlphaZero-like systems, given that AlphaZero is already 5 years old

10

Thorusss t1_ir9oyt3 wrote

So? They only had to train it once; the more efficient algorithm is now in humanity's toolbox for eternity. A 10-20% speed increase could probably pay that back this year with the compute DeepMind uses alone.

1

Lairv t1_ira9aul wrote

My point is that, since these methods can be applied in almost any scientific field, it would be beneficial if not only Google, Microsoft, Facebook, and OpenAI could train them

6