
micaroma t1_j3469ny wrote

"But that is also beside the point, there was an improvement while the sessions lasted."

Really? That seems like the most important factor of "self-improvement". If it only corrects its error within the session but makes the same error again after you refresh the page, then it didn't improve itself, it simply improved its output. There's a huge difference between permanently upgrading your own capabilities from external input and simply fixing text already written on the page with external input.

(Also, it sometimes continues to make the same error within the same session even after you point out its mistake, which is even stronger evidence against true self-improvement.)


visarga t1_j34sgk5 wrote

I don't see the problem. The language model can get feedback from code execution. If it's a matter of facts, it can be given access to a search engine, which provides grounding and fresh data. Either way, the end effect is that it will be much more correct. As long as you can fit the data or the code-execution results into the prompt, all is ok.
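That feedback loop is easy to sketch. Here's a minimal version, where `ask_model` is a stand-in for whatever model API you're calling (not a real library function), and the "grounding" is just the interpreter's error message fed back into the prompt:

```python
import subprocess
import sys


def run_snippet(code: str) -> tuple[bool, str]:
    """Execute a code snippet in a subprocess and capture the result."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=10,
    )
    ok = proc.returncode == 0
    return ok, proc.stdout if ok else proc.stderr


def refine(ask_model, task: str, max_rounds: int = 3) -> str:
    """Feed execution errors back into the prompt until the code runs."""
    prompt = task
    code = ask_model(prompt)
    for _ in range(max_rounds):
        ok, output = run_snippet(code)
        if ok:
            return code
        # Ground the next attempt in the actual error message.
        prompt = (f"{task}\n\nPrevious attempt:\n{code}\n"
                  f"Error:\n{output}\nFix it.")
        code = ask_model(prompt)
    return code
```

Note that nothing here changes the model's weights; the "improvement" lives entirely in the prompt, which is exactly the in-session fixing the parent comment is talking about.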

But if we save the correctly executed tasks and problems, we could build a new dataset for fine-tuning the model. That way it could genuinely learn as well.