
ciarenni t1_j5m0jn5 wrote

> I heard that ChatGPT, when asked to code something, was basically just scraping from GitHub. At what point does an AI infringe on copyright, and who is responsible?

Microsoft has already done this. Here's the short version.

A few years back, Microsoft bought GitHub. Repositories on GitHub have a license, specified by the author, stating how they can be used. These licenses range from "lol, do whatever, I don't care, but don't expect any support from me", to something akin to standard copyright.

Microsoft also makes Visual Studio, a program for writing code with lots of niceties to help people develop more efficiently and easily than writing code in notepad.exe. GitHub more recently released a tool called "Copilot", available as an extension for Visual Studio, which basically reads the half-built code you have and uses machine learning to offer suggestions.

Now then, as an exercise for the reader, knowing that Microsoft owns GitHub and also Visual Studio, where do you think they got the data to train that ML model? If you guessed "from the code on GitHub", you'd be right! And bonus points if you followed up with "but wait, surely they only used code they were allowed to based on the license specified?" Hint: No. It's literally plagiarism.

21

Nebuli2 t1_j5pwvqt wrote

Yep, so they just let you know that they pass off any responsibility for infringing on licenses to you, the user.

0