gurenkagurenda t1_j8vnao5 wrote

>so... predictive

No, not in any but the absolute broadest sense of that word, which would apply to any model which outputs text. In particular, it is not "search out the most common next word", because "most common" is not the criterion it is being trained on. Satisfying the reward model is not a matter of matching a corpus. Read the article I linked.
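
To make the distinction concrete, here is a toy sketch. It is not how RLHF training actually works: a real system updates a policy against a learned reward model rather than picking words by a lookup table. The corpus, the `toy_reward` scores, and all function names below are made up for illustration. The only point is that "most common in the corpus" and "scores highest under a reward" are different criteria and can pick different words.

```python
from collections import Counter

# Toy corpus; purely illustrative, not real training data.
corpus = "the cat sat on the mat the cat ate the fish".split()

# All (word, next_word) pairs that occur in the toy corpus.
bigrams = list(zip(corpus, corpus[1:]))


def most_common_next(word):
    """Return the word that most often follows `word` in the toy corpus."""
    followers = Counter(nxt for prev, nxt in bigrams if prev == word)
    return followers.most_common(1)[0][0]


def toy_reward(word):
    """Hypothetical stand-in for a learned reward model (scores are invented)."""
    scores = {"cat": 0.1, "mat": 0.3, "fish": 0.9}
    return scores.get(word, 0.0)


def reward_preferred_next(word):
    """Return the follower of `word` that the toy reward scores highest."""
    candidates = {nxt for prev, nxt in bigrams if prev == word}
    return max(candidates, key=toy_reward)


print(most_common_next("the"))       # frequency criterion
print(reward_preferred_next("the"))  # reward criterion
```

With this corpus the frequency criterion picks "cat" (it follows "the" twice), while the invented reward picks "fish", which follows "the" only once.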