Viewing a single comment thread. View all comments

smallest_meta_review OP t1_ivcghme wrote

Oh, so one of the examples in the blog post is that we start with a DQN agent with a 3-layer CNN architecture and reincarnate another Rainbow agent with a ResNet architecture (Impala-CNN) using the QDagger approach for reincarnation. Once reincarnated, the ResNet Rainbow agent is further trained with RL to maximize reward. See the paper here for more details: https://openreview.net/forum?id=t3X5yMI_4G2

1