Viewing a single comment thread. View all comments

audioen t1_jdujtbl wrote

Yes. Directly predicting the answer in one step from a question is a difficult ask. Decomposing the problem to discrete steps, and writing out these steps and then using these sub-answers to compose the final result is evidently simpler and likely requires less outright memorization and depth in network. I think it is also how humans work out answers, we can't just go from question to answer unless the question is simple or we have already memorized the answer.

Right now, we are asking the model to basically memorize everything, and hoping it generalizes something like cognition or reasoning in the deep layers of the network, and to degree this happens. But I think it will be easier to engineer good practical Q&A system by being more intelligent about the way LLM is used, perhaps just by recursively querying itself or using the results of this kind of recursive querying to generate vast synthetic datasets that can be used to train new networks that are designed to perform some kind of LLM + scratchpad for temporary results = answer type behavior.

One way to do it today with something like GPT4 might be to just ask it to write its own prompt. When the model gets the human question, the first prompt actually executed by AI could be "decompose the user's prompt to a simpler, easier to evaluate subtasks if necessary, then perform these subtasks, then respond".

3