Comment by organsnyder

6 months ago

Anthropic is selling a service that incorporates these pirated works.

That a service incorporating the authors' works exists is not at issue. The plaintiffs' claims are, as summarized by Alsup:

  First, Authors argue that using works to train Claude’s underlying LLMs 
  was like using works to train any person to read and write, so Authors 
  should be able to exclude Anthropic from this use (Opp. 16). 

  Second, to that last point, Authors further argue that the training was 
  intended to memorize their works’ creative elements — not just their 
  works’ non-protectable ones (Opp. 17).

  Third, Authors next argue that computers nonetheless should not be 
  allowed to do what people do. 

https://media.npr.org/assets/artslife/arts/2025/order.pdf

  • The first paragraph sounds absurd, so I looked into the PDF, and here's the full version I found:

    > First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16). But Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable. For centuries, we have read and re-read books. We have admired, memorized, and internalized their sweeping themes, their substantive points, and their stylistic solutions to recurring writing problems.

    Couldn't have put it better myself (though $deity knows I tried many times on HN). Glad to see Judge Alsup continues to be the voice of common sense in legal matters around technology.

    • > Glad to see Judge Alsup continues to be the voice of common sense in legal matters around technology

      Yep, that name's a blast from the past! He was the judge on the big Google/Oracle case about Android and Java years ago, IIRC. I think he even learned to write some Java so he could better understand the case.

    • For everyone arguing that there’s no harm in anthropomorphizing an LLM, witness this rationalization. They talk about training and learning as if this is somehow comparable to human activities. The idea that LLM training is comparable to a person learning seems way out there to me.

      “We have admired, memorized, and internalized their sweeping themes, their substantive points, and their stylistic solutions to recurring writing problems.”

      Claude is not doing any of these things. There is no admiration, no internalizing of sweeping themes. There’s a network encoding data.

      We’re talking about a machine that accepts content and then produces more content. It’s not a person, it’s owned by a corporation that earns money on literally every word this machine produces. If it didn’t have this large corpus of input data (copyrighted works) it could not produce the output data for which people are willing to pay money. This all happens at a scale no individual could achieve because, as we know, it is a machine.


  • > That a service incorporating the authors' works exists is not at issue.

    It's not at issue because it isn't currently illegal; nobody could have foreseen this years ago.

    But it is profiting off of the unpaid work of millions. And there's very little chance of change because it's so hard to pass new protection laws when you're not Disney.

    • It's not an issue because it's not what this case was about, as the linked document explicitly states. The Authors did not contest the legality of the model's outputs, only the inputs used in training.


    • Marx wrote, "The tradition of all dead generations weighs like an Alp on the brains of the living," and that would be true if one were obligated to pay the full freight of one's antecedents. The more positive truth is that the brains of the living reach new heights from that Alp and build ever new heights for those who come afterwards.

  • Computers cannot learn and are not subject to laws. What happens is that a human takes a copyrighted work, makes an unauthorized digital copy, and loads it into a computer without authorization from the copyright owner.

  • > underlying LLMs was like using works to train any person to read and write

    I don't think humans learn via backprop or in rounds/batches; our learning is more "online".

    If I input text into an LLM it doesn't learn from that unless the creators consciously include that data in the next round of teaching their model.

    Humans also don't require samples of every text in history to learn to read and write well.

    Hunter S. Thompson didn't need to ingest the Harry Potter books to write.
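
    The batch-vs-online distinction this comment draws can be sketched with a toy example (editor's illustration only — a one-parameter least-squares fit, nothing from the case or from any actual LLM training pipeline):

    ```python
    # Toy contrast between "batch" training -- repeated passes over a
    # frozen dataset, the way LLM pretraining works, where new text is
    # only learned if it's added to the next training run -- and an
    # "online" update that learns from each example the moment it arrives.

    def batch_train(w, dataset, lr=0.1, epochs=3):
        """Gradient-descent passes over a fixed corpus (fit y = w * x)."""
        for _ in range(epochs):
            for x, y in dataset:
                w -= lr * (w * x - y) * x  # gradient of squared error
        return w

    def online_update(w, x, y, lr=0.1):
        """Single-example update applied immediately to new data."""
        return w - lr * (w * x - y) * x

    data = [(1.0, 2.0), (2.0, 4.0)]  # consistent with y = 2 * x

    w_batch = batch_train(0.0, data)     # three passes over frozen data
    w_online = 0.0
    for x, y in data:                    # one pass, updating as data arrives
        w_online = online_update(w_online, x, y)
    ```

    Both updates use the same gradient; the difference is purely when data is seen — the batch learner revisits a fixed corpus, while the online learner never waits for a "next round."
    
    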