
Disastrous_Elk_6375 t1_j9nrm6w wrote

> It does memorize short snippets in some cases (especially when a snippet is repeated many times in training data)

And, to be fair, how can it not? How many different ways can you write a simple for loop to print some objects, match a regex, call an API, and so on?

5

visarga t1_j9qxgt2 wrote

If you go down to individual words or characters, everything is reused. If you go up, a random 10-word snippet usually appears nowhere else on the internet. But boilerplate and basic things might be replicated in all shapes and forms.
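A rough way to see this for yourself is to count how many word n-grams repeat as n grows. This is only a toy sketch: the function name `repeated_ngram_fraction` and the three-line `corpus` are made up for illustration, and whitespace splitting stands in for a real tokenizer.

```python
from collections import Counter


def repeated_ngram_fraction(texts, n):
    """Fraction of word n-gram occurrences whose n-gram appears more than once.

    Hypothetical helper for illustration; splits on whitespace only.
    """
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        # Slide a window of n words across the text.
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total


# Toy corpus (an assumption, not real training data): two near-identical
# boilerplate lines plus one unrelated sentence.
corpus = [
    "for item in items print item",
    "for item in items print item and then exit",
    "the quick brown fox jumps over the lazy dog",
]

if __name__ == "__main__":
    for n in (1, 3, 6, 7):
        print(n, round(repeated_ngram_fraction(corpus, n), 3))
```

On a toy corpus like this, single words repeat a lot, the shared boilerplate line still shows up as a repeat at medium n, and past the length of the shared line nothing repeats at all. A real corpus would need proper tokenization and far more data, but the trend should look similar.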

1