Comment by jillesvangurp

9 hours ago

AIs are not human and therefore their output is a human authored contribution and only human authored things are covered by copyright. The work might hypothetically infringe on other people's copyright. But such an infringement does not happen until a human decides to create and distribute a work that somehow integrates that generated code or text.

The solution documented here seems very pragmatic. You as a contributor simply state that you are making the contribution and that you are not infringing on other people's work with that contribution under the GPLv2. And you document the fact that you used AI for transparency reasons.

There is a lot of legal murkiness around how training data is handled, and the output of the models. Or even the models themselves. Is something that in no way or shape resembles a copyrighted work (i.e. a model) actually distributing that work? The legal arguments here will probably take a long time to settle but it seems the fair use concept offers a way out here. You might create potentially infringing work with a model that may or may not be covered by fair use. But that would be your decision.

For small contributions to the Linux kernel it would be hard to argue that a passing resemblance of say a for loop in the contribution to some for loop in somebody else's code base would be anything else than coincidence or fair use.

23 comments

friendzis 1 hour ago

> Is something that in no way or shape resembles a copyrighted work (i.e. a model) actually distributing that work?

Does a digitally encoded version resemble a copyrighted work in some shape or form? </snark>

Where is this hangup on models being something entirely different than an encoding coming from? Given enough prodding they can reproduce training data verbatim or close to that. Okay, given enough prodding notepad can do that too, so uncertainty is understandable.

This is one of the big reasons companies are putting effort into so-called "safety": when the legal battles are eventually fought, they will be able to argue that they did their best to make the amount of prodding required to extract any information that could expose them to liability too great to matter.

nitwit005 8 hours ago

That you can't copyright the AI's output (in the US, at least), doesn't imply it doesn't contain copyrighted material. If you generate an image of a Disney character, Disney still owns the copyright to that character.

metalcrow 17 minutes ago

You can copyright AI output assuming there is a "reasonable" degree of human involvement. https://www.cnet.com/tech/services-and-software/this-company...
NitpickLawyer 2 hours ago

> That you can't copyright the AI's output (in the US, at least),
It's also not really clear if you can or cannot copyright AI output. The case that everyone cites didn't even reach the point where courts had to rule on that. The human in that case decided to file the copyright for an AI, and the courts ruled that according to the existing laws copyright must be filed by a person/human/whatever.
So we don't yet have case law where someone used AI generation and claimed the output as written by them.
fxtentacle 4 hours ago

Yes. And that’s why the rules say that the human submitting the code is responsible for preventing this case.

ninjagoo 8 hours ago

IANAL; this is my limited understanding of the matter. With that caveat: it is easy to forget that copyright applies to the output. Verbatim or exact reproductions and derivatives of a covered work are already covered by copyright.

So if the AI outputs Starry Night, or Starry Night in a different color theme, that's likely infringement without permission from van Gogh, who would have recourse against someone, either the user or the AI provider.

But a starry-night style picture of an aquarium might not be infringing at all.

>For small contributions to the Linux kernel it would be hard to argue that a passing resemblance of say a for loop in the contribution to some for loop in somebody else's code base would be anything else than coincidence or fair use.

I would argue that if it was a verbatim reproduction of a copyrighted piece of software, that would likely be infringing. But if it was similar only in style, with different function names and structure, probably not infringing.

Folks will argue that some things might be too small to write any differently, for example a tiny snippet like Python's print("hello"), or 1+1=2, or the for loop in your example. In that case it lacks enough original expression to qualify for copyright protection anyway.

Lerc 8 hours ago

>AIs are not human and therefore their output is a human authored contribution and only human authored things are covered by copyright.

That is a non sequitur. Also, I'm not sure if copyright applies to humans, or persons (not that I have encountered particularly creative corporations, but Taranaki Maunga has been known for large scale decorative works)

Sharlin 4 hours ago

Copyright applies to legal persons, that's why corporations can have copyright at all.
direwolf20 1 hour ago

A "large scale decorative work" is the strangest euphemism for a dormant volcano I've ever heard.

mcv 8 hours ago

Didn't a court in the US declare that AI generated content cannot be copyrighted? I think that could be a problem for AI generated code. Fine for projects with an MIT/BSD license I suppose, but GPL relies on copyright.

However, if the code has been slightly changed by a human, it can be copyrighted again. I think.

simonw 7 hours ago
Thaler v. Perlmutter said that an AI system cannot be listed as the sole author of a work - copyright requires a human author.
US Copyright Office guidance in 2023 said work created with the help of AI can be registered as long as there is "sufficient human creative input". I don't believe that has ever been qualified with respect to code, but my instinct is that the way most people use coding agents (especially for something like kernel development) would qualify.
- davemp 4 hours ago
  
  Interesting. That seems to suggest that one would need to retain the prompts in order to pursue copyright claims if a defendant can cast enough doubt on human authorship.
  Though I guess such a suit is unlikely if the defendant could just AI wash the work in the first place.
tadfisher 7 hours ago
No, a court did not declare that. The case involved a person trying to register a work with only the AI system listed as author. The courts decided that you can't do that; you need to list a human being as author to register a work with the Copyright Office. This stems from existing precedent where someone tried to register a photograph with the monkey photographer listed as author.
I don't believe the idea that humans can or can't claim copyright over AI-authored works has been tested. The Copyright Office says your prompt doesn't count and you need some human-authored element in the final work. We'll have to see.
- papercrane 5 hours ago
  
  It's almost a certainty that you can't copyright code that was generated entirely by an AI.
  Copyright requires some amount of human originality. You could copyright the prompt, and if you modify the generated code you can claim copyright on your modifications.
  The closest applicable case would be the monkey selfie.
  https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...
  
- manwe150 5 hours ago
  
  I’m curious to see if subscription vs free ends up mattering here. If it is a work for hire, generally it doesn’t matter how the work was produced, the end result is mine, because I contracted and instructed (prompted?) someone to do it for me. So will the copyright office decide it cares if I paid for the AI tool explicitly?
RussianCow 7 hours ago
> Didn't a court in the US declare that AI generated content cannot be copyrighted?
No, my understanding is that AI generated content can't be copyrighted by the AI. A human can still copyright it, however.
- Sharlin 4 hours ago
  
  It's obvious that a computer program cannot have copyright because computer programs are not persons in any currently existing jurisdiction.
  Whether a person can claim copyright of the output of a computer program is generally understood as depending on whether there was sufficient creative effort from said person, and it doesn't really matter whether the program is Photoshop or ChatGPT.
  
singpolyma3 5 hours ago

Public domain code is GPL compatible