visarga t1_j2y2h2v wrote on January 4, 2023 at 7:21 PM

AI will surpass humans in all domains where it can generate its own problem-solving data. AlphaZero did it. It trained through self-play and beat humans, with no imitation and no human data at all.

What we need is to set up challenges, problems, tasks or games for the language model to play. Then we test when it does well and add those solutions to the training set. It becomes a loop of self-improvement through problem solving. The learning signal is provided by validation, so it doesn't depend on our data or manual work. The model can even generate its own challenges.
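
To make the loop concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the "model" is just a random guesser, the challenges are tiny linear equations, and the names (`generate_challenge`, `propose_solution`, `verify`, `self_improvement_round`) are hypothetical. The point is only the shape of the loop: generate a task, attempt it, keep what passes validation.

```python
import random

def generate_challenge(rng):
    """Hypothetical task generator: find x with a * x + b == c."""
    a = rng.randint(1, 9)
    x = rng.randint(0, 20)
    b = rng.randint(0, 9)
    return (a, b, a * x + b)

def propose_solution(challenge, rng):
    """Stand-in for the model: here it only guesses at random."""
    return rng.randint(0, 20)

def verify(challenge, x):
    """Validation is cheap and needs no human labels."""
    a, b, c = challenge
    return a * x + b == c

def self_improvement_round(training_set, rng, n_challenges=50, n_attempts=30):
    """Attempt each challenge; keep only solutions that pass verification."""
    for _ in range(n_challenges):
        challenge = generate_challenge(rng)
        for _ in range(n_attempts):
            candidate = propose_solution(challenge, rng)
            if verify(challenge, candidate):
                training_set.append((challenge, candidate))
                break
    return training_set

rng = random.Random(0)
training_set = self_improvement_round([], rng)
print(f"verified solutions collected: {len(training_set)}")
```

In a real setup the guesser would be the language model and the collected pairs would go into the next round of training, but that part is not shown here.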

More recently, AlphaTensor found faster matrix multiplication algorithms for some matrix sizes. Humans had tried their hand at this task for decades, and in the end the AI beat the best known results. Why? Massive search + verification + learning = a "smart brute forcing" approach.
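
The verification part is what makes this workable: checking a candidate multiplication scheme is cheap even when finding one is hard. A small sketch of that idea follows. It is not AlphaTensor's code; it uses Strassen's classic 7-multiplication scheme for 2x2 matrices as the candidate, and `check_candidate` and `broken_candidate` are hypothetical names made up for the example. Random testing like this can only reject wrong candidates, it is not a proof of correctness.

```python
import random

def naive_2x2(A, B):
    """Reference 2x2 product using 8 multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    return [[a11 * b11 + a12 * b21, a11 * b12 + a12 * b22],
            [a21 * b11 + a22 * b21, a21 * b12 + a22 * b22]]

def strassen_2x2(A, B):
    """Strassen's scheme: the same product with 7 multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

def broken_candidate(A, B):
    """A deliberately wrong scheme, to show the verifier rejecting it."""
    C = strassen_2x2(A, B)
    C[0][0] += A[0][0]
    return C

def check_candidate(candidate, trials=200, seed=0):
    """Compare a candidate against the reference on random integer matrices."""
    rng = random.Random(seed)
    for _ in range(trials):
        A = [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]
        B = [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]
        if candidate(A, B) != naive_2x2(A, B):
            return False
    return True

print(check_candidate(strassen_2x2))      # passes every random trial
print(check_candidate(broken_candidate))  # rejected by the verifier
```

A search procedure can throw huge numbers of candidates at a checker like this and keep only the ones that pass, which is the "smart brute forcing" idea in miniature.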