visarga t1_iwdt41n wrote on November 14, 2022 at 10:10 PM

Sometimes people say "Language models are like parrots. They learn patterns, but could never do something novel or surpass their training data."

This is proof that it is possible. What you need is to learn from validation. This process can be applied to math and code because a solution that is hard to find can still be trivial to validate.
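To make the "hard to find, trivial to validate" point concrete, here is a minimal sketch using integer factorization as a stand-in task. The candidate list plays the role of sampled model outputs and is made up for illustration; `is_valid_factorization` and `filter_valid` are hypothetical helper names, not part of any particular method.

```python
from math import prod

def is_valid_factorization(n, factors):
    """Cheap check: at least one factor, every factor > 1, product equals n."""
    return len(factors) > 0 and all(f > 1 for f in factors) and prod(factors) == n

def filter_valid(n, candidates):
    """Keep only the candidates that pass the validator."""
    return [c for c in candidates if is_valid_factorization(n, c)]

# Made-up candidate answers for "factor 91" (assumption: these stand in for model samples).
candidates = [[7, 13], [9, 10], [91, 1], [13, 7]]
verified = filter_valid(91, candidates)
print(verified)
```

Only the candidates that multiply back to 91 with non-trivial factors survive. In a learn-from-validation setup, those verified outputs are what you would feed back in as training examples.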

When you don't have a symbolic way to validate the solution, you can sample an ensemble of solutions and choose the answer that appears most frequently.
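The ensembling idea can be sketched as a simple majority vote over final answers. This is only an illustration under assumptions: the `samples` list is made up, and `majority_vote` is a hypothetical helper. The agreement ratio it returns is one possible rough confidence signal, not something the approach requires.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer and the fraction of samples that agree with it."""
    counts = Counter(answers)
    answer, count = counts.most_common(1)[0]
    return answer, count / len(answers)

# Made-up final answers extracted from several sampled solutions to the same problem.
samples = ["42", "41", "42", "42", "17"]
best, agreement = majority_vote(samples)
print(best, agreement)
```

Ties go to whichever answer was seen first, so in practice you might want to sample more solutions or discard low-agreement cases.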