Submitted by vidul7498 t3_11itl7g in MachineLearning
In a recent tweet (https://twitter.com/fchollet/status/1630241783111364608), Francois Chollet states:
"The answer to "when should I use deep RL" is that you shouldn't -- you should reframe your problem as a supervised learning problem, which is the only thing that curve-fitting can handle. In all likelihood this applies to RLHF for LLMs."
The people at DeepMind and OpenAI still seem bullish on RL, but I have seen this kind of sentiment among other big names in DL as well. The most common view I've seen is that RL is only good for extremely specific scenarios, and that supervised learning is a much better option everywhere else.
What do you guys think: is RL doomed, or is it the future? Also, will it one day be possible to apply RL to a more general range of problems, or will it always be niche?
currentscurrents t1_jazwqft wrote
The reason you want to do RL is that there are problem scenarios where RL is the only way to learn the task.
Unsupervised learning can teach a model to understand the world, and supervised learning can teach a model to complete a human-defined task. But reinforcement learning can teach a model to choose its own tasks to complete arbitrary goals.
Trouble is, the training signal in reinforcement learning is a lot weaker (a scalar reward instead of a full target for every example), so you need ridiculous amounts of training data. Current thinking is that you need to use unsupervised learning to learn a world model, plus RL to learn how to achieve goals inside that model. This combination has worked very well for things like DreamerV3.
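To make the point about the training signal a bit more concrete, here is a rough toy sketch. It is not taken from DreamerV3 or any real benchmark; the function names, the 10-action setup, and the 1000 contexts are all made up for illustration. It counts how many interactions a learner needs to find the right answer for each context when it is told the label directly (supervised) versus when it only sees a 1/0 reward for the action it tried (a bandit-style RL signal).

```python
import random


def supervised_queries(labels):
    # Supervised case: one labeled example per context reveals
    # the correct answer directly, so one query per context suffices.
    return len(labels)


def bandit_queries(labels, num_actions, rng):
    # Reward-only case: the learner sees 1 if its action was correct
    # and 0 otherwise, so it has to keep trying untested actions
    # until one of them is rewarded.
    total = 0
    for correct in labels:
        candidates = list(range(num_actions))
        rng.shuffle(candidates)
        for action in candidates:
            total += 1
            reward = 1 if action == correct else 0
            if reward:
                break
    return total


rng = random.Random(0)  # fixed seed so the run is repeatable
num_actions = 10        # hypothetical number of possible answers
labels = [rng.randrange(num_actions) for _ in range(1000)]

sup = supervised_queries(labels)
rl = bandit_queries(labels, num_actions, rng)
print("supervised queries:", sup)
print("reward-only queries:", rl)
```

With uniformly random trial order, the reward-only learner needs about (num_actions + 1) / 2 tries per context on average, so the gap grows with the number of possible actions. Real RL problems are usually worse than this toy, since rewards can also be delayed and noisy.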