smurfpiss t1_ivaf5ia wrote

I don't have much experience with RL, but how is that different from an algorithm going through training iterations?

In that case the parameters are tweaked from past learned parameters. What's the benefit of learning from another algorithm? Is it some kind of weird offspring of skip connections and transfer learning?

5

smallest_meta_review OP t1_ivaghqa wrote

Good question. The original blog post somewhat covers this:

> Imagine a researcher who has trained an agent A_1 for some time, but now wants to experiment with better architectures or algorithms. While the tabula rasa workflow requires retraining another agent from scratch, Reincarnating RL provides the more viable option of transferring the existing agent A_1 to a different agent and training this agent further, or simply fine-tuning A_1.

But this is not what usually happens in research. For example, each time we train a new agent to, let's say, play an Atari game, we train it from scratch and ignore all the prior agents trained on that game. This work asks: why not reuse the knowledge learned by an existing agent when training new agents (which may be totally different)?

3

smurfpiss t1_ivah7ul wrote

So, transfer learning but with different architectures? That's pretty neat. Will give it a read, thanks 😊

3

smallest_meta_review OP t1_ivam34g wrote

Yeah, or even across different classes of RL methods: reusing a policy to train a value-based (e.g., DQN) or model-based RL method.

3
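
To make the "reuse a policy to train a value-based method" idea from the last reply concrete, here is a minimal sketch, not the method from the blog post or paper. It assumes a toy five-state chain environment, a fixed teacher policy given as action probabilities, and a tabular Q-learning student. The student combines ordinary TD updates with a distillation term that pulls softmax(Q) toward the teacher and is decayed to zero over training. All names (`distill_step`, `td_step`, `train_student`, the chain layout, the decay schedule) are hypothetical and chosen only for illustration.

```python
import math
import random

def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def distill_step(q_row, teacher_probs, lr=0.5, temperature=1.0):
    """Gradient step on cross-entropy(teacher, softmax(Q / temperature)).

    The gradient with respect to each Q-value is (student - teacher) / temperature.
    """
    student = softmax(q_row, temperature)
    return [q - lr * (s - t) / temperature
            for q, s, t in zip(q_row, student, teacher_probs)]

def td_step(q, state, action, reward, next_state, done, lr=0.5, gamma=0.9):
    """Standard one-step Q-learning update, applied in place."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += lr * (target - q[state][action])

def train_student(teacher, n_states=5, episodes=200, epsilon=0.1, seed=0):
    """Train a tabular Q-learner on a toy chain, guided by a teacher policy.

    Assumed setup: action 1 moves right, action 0 moves left, and reaching
    the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for episode in range(episodes):
        # Distillation weight decays linearly to zero over the first half.
        weight = max(0.0, 1.0 - episode / (episodes / 2))
        state = 0
        for _ in range(100):  # step cap so every episode terminates
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            td_step(q, state, action, 1.0 if done else 0.0, next_state, done)
            if weight > 0.0:
                q[state] = distill_step(q[state], teacher[state], lr=0.5 * weight)
            if done:
                break
            state = next_state
    return q

# Teacher policy: prefers moving right with probability 0.9 in every state.
teacher = [[0.1, 0.9] for _ in range(5)]
q_table = train_student(teacher)
```

The student never copies the teacher's parameters, so the two can be entirely different kinds of agents; only the teacher's action probabilities are needed. Decaying the distillation weight lets the student lean on the teacher early and rely on its own TD targets later.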