This should fail since the original work is not being redistributed. To wholly recreate a repo on which Codex was trained you’d have to literally start typing the original code, and even then the contextual suggestions would likely yield a different result from the original anyways. I could be mistaken but I remember reading about some litigation in this space concerning a model trained on copyrighted data. The court ruled in favor of the defendant because the resulting model couldn’t possibly reproduce the original work. It’s tricker here because technically you could recreate the original work, but you would have to know very well what the original work was to begin with to actually recreate it, and if that’s the case what’s the point of using copilot to begin with. I could be (and probably am) wrong.
Imagine trying to recreate PyTorch from scratch using Codex or copilot. IF, and that’s a big if, one did so the author of the recreation would still have to attribute it.
Well based on the complaint, they probably have a case. However, the solution to the problem may not really be feasible, since it would imply that the copilot also generates a disclaimer based on all the licenses used, so then if a user deletes that, he is breaking the license.
Now, given that this may affect like 100k repositories, the disclaimer file must be in the megabytes.
sweeetscience t1_iv0q0xr wrote
This should fail since the original work is not being redistributed. To wholly recreate a repo on which Codex was trained you’d have to literally start typing the original code, and even then the contextual suggestions would likely yield a different result from the original anyways. I could be mistaken but I remember reading about some litigation in this space concerning a model trained on copyrighted data. The court ruled in favor of the defendant because the resulting model couldn’t possibly reproduce the original work. It’s tricker here because technically you could recreate the original work, but you would have to know very well what the original work was to begin with to actually recreate it, and if that’s the case what’s the point of using copilot to begin with. I could be (and probably am) wrong.
Imagine trying to recreate PyTorch from scratch using Codex or copilot. IF, and that’s a big if, one did so the author of the recreation would still have to attribute it.
Not legal advice