randomkolmogorov OP t1_iz9z8av wrote on December 7, 2022 at 3:59 PM

Reply to comment by UnusualClimberBear in [Discussion] Suggestions on Trust Region Methods For Natural Gradient by randomkolmogorov

Thank you, this talk is very helpful. I was thinking about the formulation in terms of the natural gradient, but adapting the TRPO approach to my case seems like a good idea.

UnusualClimberBear t1_iza00fp wrote on December 7, 2022 at 4:04 PM

TRPO is often too slow in practice because of its line search, so researchers often prefer PPO, which is faster and also has some guarantees in terms of KL on the state distribution. I'd be curious to hear about your problem if TRPO ends up being the best choice.
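
For readers unfamiliar with the trade-off mentioned here, the following is a minimal sketch, not the reference implementation of either algorithm. It assumes a toy categorical distribution parameterized by logits. The names `backtracking_line_search`, `categorical_kl`, and `ppo_clipped_objective` are hypothetical helpers introduced only for illustration. The first function shows the kind of KL-constrained backtracking that makes a TRPO-style update expensive (each trial step re-evaluates the objective and the KL); the last shows the clipped surrogate that PPO uses in its place.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D array of logits.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def categorical_kl(old_logits, new_logits):
    # KL(p_old || p_new) between two categorical distributions.
    p, q = softmax(old_logits), softmax(new_logits)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def backtracking_line_search(theta, full_step, objective, kl, max_kl,
                             shrink=0.5, max_backtracks=10):
    # Shrink the proposed step until it both improves the objective
    # and stays inside the KL trust region (illustrative defaults).
    base = objective(theta)
    for i in range(max_backtracks):
        candidate = theta + (shrink ** i) * full_step
        if kl(theta, candidate) <= max_kl and objective(candidate) > base:
            return candidate, i
    # No acceptable step found: keep the old parameters.
    return theta, max_backtracks

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    # PPO's clipped surrogate: no line search, only a clipped probability ratio.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    return float(np.mean(np.minimum(ratio * advantage, clipped * advantage)))
```

In this sketch the line search may call `objective` and `kl` several times per update, whereas the clipped objective is a single pass that can be optimized with ordinary gradient steps.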

randomkolmogorov OP t1_iza7hwl wrote on December 7, 2022 at 4:54 PM

I am not really doing RL, but rather aleatoric uncertainty quantification, where I need to optimize over a manifold of functions. My distributions are much more manageable than they would be for policy gradient, so I have a feeling that with some cleverness it might be possible to sidestep many of the complications in TRPO while still using the ideas from the paper.
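
As one possible illustration of why a manageable distribution helps, here is a minimal sketch of a KL-limited natural gradient step. The poster's actual model is not described in the thread, so everything here is an assumption: a single univariate Gaussian N(mu, sigma^2) parameterized by (mu, log sigma), fitted by negative log-likelihood. In that parameterization the Fisher information is diag(1/sigma^2, 2), so the natural gradient is available in closed form and no conjugate-gradient solve is needed. The function names are hypothetical.

```python
import numpy as np

def gaussian_nll(mu, log_sigma, y):
    # Mean negative log-likelihood of y under N(mu, sigma^2), up to a constant.
    return float(log_sigma + np.mean((y - mu) ** 2) / (2.0 * np.exp(2.0 * log_sigma)))

def gaussian_kl(mu0, log_sigma0, mu1, log_sigma1):
    # Exact KL(N(mu0, sigma0^2) || N(mu1, sigma1^2)).
    var0, var1 = np.exp(2.0 * log_sigma0), np.exp(2.0 * log_sigma1)
    return float(log_sigma1 - log_sigma0 + (var0 + (mu0 - mu1) ** 2) / (2.0 * var1) - 0.5)

def natural_gradient_step(mu, log_sigma, y, max_kl=0.01):
    var = np.exp(2.0 * log_sigma)
    # Gradient of the mean NLL with respect to (mu, log_sigma).
    grad = np.array([-(np.mean(y) - mu) / var,
                     1.0 - np.mean((y - mu) ** 2) / var])
    # Closed-form Fisher information in (mu, log_sigma) coordinates.
    fisher_diag = np.array([1.0 / var, 2.0])
    nat_grad = grad / fisher_diag
    quad = float(grad @ nat_grad)
    if quad == 0.0:
        return mu, log_sigma  # already at a stationary point
    # Step size that puts the second-order KL estimate at max_kl,
    # capped at 1 so the step never exceeds the plain natural gradient step.
    beta = min(1.0, np.sqrt(2.0 * max_kl / quad))
    return mu - beta * nat_grad[0], log_sigma - beta * nat_grad[1]
```

The step size comes from the quadratic approximation KL ≈ 0.5 * step^T F step, so the exact KL returned by `gaussian_kl` will only be close to `max_kl`, not equal to it. A backtracking check against the exact KL could be added if a hard constraint is required.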