visarga t1_j6arwxp wrote

Generating data through RL, as in AlphaGo or "Evolution through Large Models" (ELM), seems to show a way out. Not all data is equally useful to the model; for example, problem-solving and task-solving data is more important than raw organic text.

Basically, use an LLM to generate candidate examples and a separate system to evaluate them, so that only the useful examples are kept.
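
To make the idea concrete, here is a rough sketch of that generate-then-filter loop. Everything in it is made up for illustration: `toy_generator` is a hypothetical stand-in for an LLM that sometimes produces wrong answers, and `toy_evaluator` is a hypothetical checker for a trivial addition task. A real setup would swap in an actual model and a task-specific verifier (unit tests, a game engine, a reward model, etc.).

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[str, int]  # (problem text, proposed answer)


def toy_generator(rng: random.Random, n: int) -> List[Example]:
    """Hypothetical stand-in for an LLM: proposes answers, some of them wrong."""
    examples = []
    for _ in range(n):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        # Add an off-by-one error roughly a third of the time to mimic noisy output.
        noise = rng.choice([0, 0, 1])
        examples.append((f"{a} + {b}", a + b + noise))
    return examples


def toy_evaluator(example: Example) -> bool:
    """Independent checker: recomputes the sum instead of trusting the generator."""
    problem, answer = example
    left, right = problem.split(" + ")
    return int(left) + int(right) == answer


def generate_and_filter(
    generate: Callable[[random.Random, int], List[Example]],
    evaluate: Callable[[Example], bool],
    n: int,
    seed: int = 0,
) -> List[Example]:
    """Generate n candidates and keep only those the evaluator accepts."""
    rng = random.Random(seed)
    return [ex for ex in generate(rng, n) if evaluate(ex)]


kept = generate_and_filter(toy_generator, toy_evaluator, n=100)
print(f"kept {len(kept)} of 100 generated examples")
```

The point is only that the generator and the evaluator are separate: the filtered set `kept` is what you would feed back as training data, and its quality depends on how reliable the evaluator is.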
