Submitted by vidul7498 t3_11itl7g in MachineLearning
thiru_2718 t1_jb6njez wrote
Reply to comment by currentscurrents in To RL or Not to RL? [D] by vidul7498
>supervised learning can teach a model to complete a human-defined task. But reinforcement learning can teach a model to choose its own tasks to complete arbitrary goals.
Isn't this contradicted by LLMs demonstrating emergent abilities (like in-context learning, or learning meta-learning strategies) that allow them to tackle complex sequential tasks adaptively? There is research (e.g. https://innermonologue.github.io/) where LLMs are successfully applied to a traditional RL domain: planning and interaction for robots. While there is RLHF involved in models like ChatGPT, the bulk of the model's reasoning comes from the supervised learning.
As far as I can tell, the unexpected, emergent abilities of LLMs have somewhat rewritten our assumptions about what is possible through supervised learning, and that rethinking should be extended into the RL domain.