Comment by nadermx

19 hours ago

Funny how the copyright industry was able to spin copyright infringement into the pejorative "stealing". If you still have the item, what was stolen?

Dowling v. United States, 473 U.S. 207 (1985): The Supreme Court ruled that the unauthorized sale of phonorecords of copyrighted musical compositions does not constitute "stolen, converted or taken by fraud" goods under the National Stolen Property Act.

I still find the idea that "learning" from code is "stealing" kind of ridiculous.

  • The "learning" isn't really learning. I mean, it might be, but if you define learning to be a human endeavor, then AI can't learn.

    It's perfectly reasonable to say it's okay for humans to do something but not okay for a computer program to do the same thing. We don't have to equate AI to humans, that's a choice and usually a bad one.

    • It's also perfectly reasonable to say it's ok for a program or machine to do the same thing as a human. This has been the basis for the technological revolution since the dawn of technology.

    • If one defines 'flying' to be a bird's endeavor, then humans can't fly.

      Now, if you'll excuse me, I need to catch a metal shuttle that chucks itself through the air on wings.

  • Yes, I guess there's also no such thing as stealing in torrents, since the computer "learns" the data and returns it in a transcoded fashion, so it's technically not a reproduction. Yes, LLMs can reproduce passages from copyrighted works verbatim, but that's only because they "learned" them and are just telling you what they "know".

    The mental calisthenics required to justify this stuff must be exhausting.

    • > The mental calisthenics required to justify this stuff must be exhausting.

      It's only exhausting if you think copyright ever reasonably settled the matter of ownership of knowledge and want to morally justify an incoherent set of outcomes that you personally favor. In practice it's primarily been a tool for the powerful party in any dispute to hammer others for disrupting their business model. I think that's pretty much the only way attempting to apply ownership semantics to knowledge or information can end up.

    • This is a perfect example of 'begging the question': arriving at a conclusion from a premise assumed to be true without evidence. Your reductio does not actually demonstrate that copyright applies to LLMs, because you did not demonstrate how transcoding is comparable to inference, just that LLMs can reproduce some passages from copyrighted works. You could also produce passages from copyrighted works by generating enough random sequences of words, but no one is arguing that is comparable to transcoding. The claim that people who do not share this conclusion are engaging in motivated reasoning rests only on your assumption, has no logical backing, and is therefore begging the question.

  • I think that it's absurd that we've jumped to the conclusion backpropagation in neural networks should be legally treated the same as human learning.

    I mean, I don't think I could find a better description for following the derivatives of error in reproducing a set of works than creating a "derivative work".

    • >> ... we've jumped to the conclusion backpropagation in neural networks should be legally treated the same as human learning.

      I agree. However, the reverse is also likely true, i.e., it cannot currently be denied that learning in humans is different from learning in artificial neural networks from the point of view of production of works that mix ideas/memes from several works processed/read. Surely, as the article says, copyright law talks exclusively about humans, not machines, not animals.

  • I find it more ridiculous to equate the act of a human learning with for-profit AI training without recompense to the authors of the training material.

  • Learning, probably not.

    Copy/pasting at scale, yes

    • It is learning though. It’s not just copying the code.

      Code gets turned into tokens, and then the model learns to predict the most likely next token.

      The issue that I see most people talk about is the scale at which it is learnt.

      A human will learn from other people’s code, but not from every person’s code.

    • Copy/pasting at scale is how tons of software has been written for a long time, or have we all forgotten the jokes people used to make about StackOverflow?

  • If you can set a copyright trap and an LLM reproduces it, I think it's pretty clear-cut that it's more than just "learning".

    I have seen LLMs do all sorts of crap which was clearly reproduction of training material.

    This is also why people are most impressed with how much better it is at reproducing boilerplate rather than, say, imaginative new ideas.

    • Remember last year (?) when one of the major AIs produced a bit of code that included Jeff Geerling's name in a comment?

Everybody has done a complete 180 on copyright protections. Before, nobody cared about downloading music, movies, TV shows, or pirating games. Now, when copyright law is affecting them, they are gung-ho about protecting these billion-dollar companies' copyrights.

  • It's not just about "billion-dollar companies' copyrights", but also about voluntary copyleft free software. If I license my code under the GPL, I don't want other persons/companies to just whitewash that code through LLMs and use it in their proprietary code.

    • I agree with this, and I think that it is an open question whether training on copyrighted material is considered transformative. However, someone said that thumbnails of full photos are considered transformative enough to allow fair use, and LLM training is (in my opinion) clearly more transformative than converting a picture to a thumbnail. But we will see how it plays out.

I don't think it's unreasonable to consider it stolen potential profit, but agreed, that's not how they spin it.

“Stolen” as in “profited on IP against terms and conditions of the license”.