gurenkagurenda t1_j8v4fqg wrote on February 17, 2023 at 3:59 AM

Reply to comment by anti-torque in ChatGPT is a robot con artist, and we’re suckers for trusting it by altmorty

> It can only search out the most common next word for the context asked.

This is not actually true. It was an accurate description of earlier versions of GPT, and next-word prediction is still part of how ChatGPT and InstructGPT were trained, but both also use reinforcement learning to teach the models to do more complex tasks based on human preferences.

Also, and this is more of a nitpick, but "next word" would be greedy search, and I'm pretty sure ChatGPT uses beam search, which looks multiple words ahead.
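To make the greedy vs. beam distinction concrete, here's a minimal sketch on a made-up probability table. Everything in it (`TOY_MODEL`, `next_token_probs`, `greedy_decode`, `beam_decode`, the probabilities themselves) is hypothetical and only for illustration; it says nothing about how ChatGPT actually decodes.

```python
import math

# Hypothetical toy "language model": maps a prefix (tuple of tokens) to
# next-token probabilities. The numbers are invented for illustration.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.55, "dog": 0.45},
    ("a",): {"bird": 0.9, "fish": 0.1},
}


def next_token_probs(prefix):
    """Return the next-token distribution for a prefix, or {} if none is defined."""
    return TOY_MODEL.get(tuple(prefix), {})


def greedy_decode(max_len=2):
    """At each step, commit to the single most probable next token."""
    seq, logp = [], 0.0
    for _ in range(max_len):
        probs = next_token_probs(seq)
        if not probs:
            break
        token = max(probs, key=probs.get)
        seq.append(token)
        logp += math.log(probs[token])
    return seq, logp


def beam_decode(beam_width=2, max_len=2):
    """Keep the beam_width best partial sequences at each step, scored by total log-probability."""
    beams = [([], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            probs = next_token_probs(seq)
            if not probs:
                # Nothing to extend with; carry the sequence forward unchanged.
                candidates.append((seq, logp))
                continue
            for token, p in probs.items():
                candidates.append((seq + [token], logp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


if __name__ == "__main__":
    g_seq, g_logp = greedy_decode()
    b_seq, b_logp = beam_decode()
    print("greedy:", g_seq, round(math.exp(g_logp), 2))
    print("beam:  ", b_seq, round(math.exp(b_logp), 2))
```

On this toy table, greedy commits to "the" because it is the likeliest first word and ends up with "the cat" (0.6 × 0.55 = 0.33), while a beam of width 2 keeps "a" alive long enough to find "a bird" (0.4 × 0.9 = 0.36), which is the more probable sequence overall.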

anti-torque t1_j8vbhkb wrote on February 17, 2023 at 5:04 AM

> to teach the models to do more complex tasks based on human preferences.

so... predictive

> Also, and this is more of a nitpick, but "next word" would be greedy search....

This is fair. "Word" is too simple a unit. It picks up phrases and maxims.

gurenkagurenda t1_j8vnao5 wrote on February 17, 2023 at 7:16 AM

> so... predictive

No, not in any but the absolute broadest sense of that word, which would apply to any model that outputs text. In particular, it does not "search out the most common next word", because "most common" is not the criterion it is being trained on. Satisfying the reward model is not a matter of matching a corpus. Read the article I linked.

[deleted] t1_j8xcncs wrote on February 17, 2023 at 4:58 PM

[removed]