EternalNY1 t1_je94zko wrote

If you want what I'd consider hands-down the best explanation of how it works, read Stephen Wolfram's article. It's long (it may take up to an hour) and somewhat dense in parts, but it fully explains how it works, including the training and everything else.

What Is ChatGPT Doing … and Why Does It Work?

The amazing thing is that researchers have looked "inside" GPT-3 and discovered mysterious patterns related to language that they have no explanation for.

The patterns look like this ... they don't understand the clumping of information yet.

So any time someone says "it just fills in the next likely token", that is overly simplistic. The researchers themselves don't fully understand some of the emergent behavior it is showing.