asafaya t1_ix5xlre wrote
Reply to [D] Why do we train language models with next word prediction instead of some kind of reinforcement learning-like setup? by blazejd
"Language modeling" (Auto-regressive/Causal/Next-word prediction) is different from the general "language learning" term. The goal in mind for language model is just to model the distribution of a language X. Recently, what is going on with LLMs is that people are trying to dig into these distributions and try to utilize them using "Prompts/Chain of Thoughts".
I believe people are doing this because it is much easier than setting up the kind of environment you are proposing. But eventually language modeling will hit a limit: we will run out of data, and then the interactive approach to language learning will be the way to go beyond that limit.
There are some approaches that go in this direction, under the name Grounded Language Learning. I suggest you check out the "Experience Grounds Language" paper to get a better sense of the big picture.
asafaya t1_ix80sjx wrote
Reply to comment by blazejd in [D] Why do we train language models with next word prediction instead of some kind of reinforcement learning-like setup? by blazejd
>What language models are doing is indeed modelling the language distribution, but what the ML community wants them to be doing, and the end goal, is to create a model that learns to understand and communicate with language. You can see that in the ways we try to evaluate them, for example by asking them to solve math equations
I totally agree that this is what is happening in the ML community. I believe they will hit a wall soon, probably within ~3-5 years.