Viewing a single comment thread. View all comments

smallest_meta_review OP t1_ivjle6n wrote

Thanks for your informative reply. If interested, we have previously applied results from self-distillation to show that implicit regularization can actually lead to capacity loss in RL as bootstrapping can be viewed as self-distillation: https://drive.google.com/file/d/1vFs1FDS-h8HQ1J1rUKCgpbDlKTCZMap-/view?usp=drivesdk

1