blazejd OP t1_ix8k4wf wrote

>Sure, but the answer remains: what reward function do you use that encompasses understanding and communicating, on top of grammar?

I realize this doesn't directly answer your question; my point is that we don't know the answer, but we should at least try to pursue it.

1

blazejd OP t1_ix8ibmh wrote

>For actually learning language, in the sense of using it to convey meaningful, appropriate information, which LLMs so far cannot do, maybe it's better to take an RL approach. But I don't know how to write a reward function that encompasses that. So as long as we can't do the superior thing with either approach, we might as well focus on the easier approach to the superficial thing.

My understanding of this paragraph, simply put, is (correct me if I'm wrong): "RL might be better, but we don't know how to do it, so let's not try. Language models are doing fine."

In my opinion, in science we should focus simultaneously on easier problems that can lead to shorter-term gains (language models) AND ALSO more difficult problems that are riskier but might be better long term (RL-based).

1

blazejd OP t1_ix7mr03 wrote

Thank you everyone for your comments, they were really insightful and gave me some perspective I wouldn't have on my own. I am quite new to ML reddit so wasn't sure what to expect. Here is my quick summary/general reply.

Most of you agreed that we use language modelling because it is the most compute- and time-efficient approach and is sort of the best thing we have *right now*, but that RL would be interesting to incorporate. However, training solely with RL is difficult, including choosing a good objective.

This seems a bit similar to the hype around SVMs in the early 2000s (from what I heard from senior researchers, it was a thing). Back then we already had neural networks, but we weren't ready hardware- or data-wise, so at the time SVMs performed better due to their simplicity; after 20 years we can clearly see that neural nets were the right direction. It's easier to use language models now and they give better short-term performance, but in a couple of decades RL will probably outperform them (although very likely multi-modality will be necessary).

A currently feasible step in this direction is merging the two concepts of language models and RL-based feedback. Some papers mentioned are https://arxiv.org/abs/2203.02155 and "Experience Grounds Language" (although I haven't read them in full yet). We could initialize a customer-facing chatbot with a language model and then update it RL-style, which can be thought of as a form of online or continual learning. The RL objective could be the rating the user gives after interacting with the system, how often the user asks to talk to a human assistant, or the sentiment (positive or negative) of the user's replies. And if we can come up with that just by bouncing ideas around on reddit, then probably some company is already doing it.
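To make that concrete, here is a rough sketch of how those signals (post-chat rating, escalation to a human, sentiment of user replies) could be combined into a single scalar reward that an RL-style update would maximize. The weights, thresholds, and names are my own illustrative assumptions, not something taken from the papers above.

```python
# Illustrative sketch: combine post-conversation signals into one scalar reward
# for RL-style fine-tuning of a deployed chatbot. All weights and names are
# assumptions for the sake of the example.

from dataclasses import dataclass


@dataclass
class ConversationOutcome:
    user_rating: float        # e.g. 1-5 star rating given after the chat
    asked_for_human: bool     # did the user escalate to a human assistant?
    reply_sentiment: float    # average sentiment of user replies, in [-1, 1]


def conversation_reward(outcome: ConversationOutcome) -> float:
    """Fold the post-hoc user signals into a single scalar reward."""
    reward = 0.0
    reward += (outcome.user_rating - 3.0) / 2.0   # map 1-5 stars to [-1, 1]
    reward += outcome.reply_sentiment             # already in [-1, 1]
    if outcome.asked_for_human:
        reward -= 1.0                             # penalize escalation to a human
    return reward


# Example: a lukewarm conversation that still ended in an escalation.
outcome = ConversationOutcome(user_rating=3.0, asked_for_human=True, reply_sentiment=0.2)
print(conversation_reward(outcome))  # -0.8
```

The reward would only arrive at the end of a conversation, so in practice you'd plug something like this into a policy-gradient or RLHF-style update over whole dialogues rather than per-token.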

If you are looking for more related resources, my thoughts were inspired by the field of language emergence (https://arxiv.org/pdf/2006.02419.pdf) and this work (https://arxiv.org/pdf/2112.11911.pdf).

3

blazejd OP t1_ix7k18e wrote

What language models are doing is indeed modelling the language distribution, but what the ML community wants them to do, and what the end goal is, is to create a model that learns to understand and communicate in a language. You can see that in the ways we try to evaluate them, for example by asking them to solve math equations.

1

blazejd OP t1_ix7jekd wrote

I think u/Cheap_Meeting understood my question a bit better here. The end goal is to create an NLP model that can understand and communicate in natural language. This is why the main NLP benchmarks currently cover many different tasks. We use language models because that's an easier approach, but not necessarily a better one.

1