Comment by 6gvONxR4sf7o
6 months ago
You skipped quotes about the other important side:
> But Alsup drew a firm line when it came to piracy.
> "Anthropic had no entitlement to use pirated copies for its central library," Alsup wrote. "Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic's piracy."
That is, he ruled that
- buying, physically cutting up, physically digitizing books, and using them for training is fair use
- pirating the books for their digital library is not fair use.
> buying, physically cutting up, physically digitizing books, and using them for training is fair use
So Suno would only really need to buy the physical albums and rip them to be able to generate music at an industrial scale?
Yes! Training and generation are fair use. You are free to train and generate whatever you want in your basement for whatever purpose you see fit. Build a music collection, go ham.
If the output from said model uses the voice of another person, for example, we already have a legal framework in place for determining if it is infringing on their rights, independent of AI.
Courts have heard cases of individual artists copying melodies, because melodies themselves are copyrightable: https://www.hypebot.com/hypebot/2020/02/every-possible-melod...
Copyright law is a lot more nuanced than anyone seems to have the attention span for.
> Yes!
But Suno is definitely not training models in their basement for fun.
They are a private company selling music, using music made by humans to train their models, to replace human musicians and artists.
We'll see what the courts say but that doesn't sound like fair use.
> Copyright law is a lot more nuanced than anyone seems to have the attention span for.
Copyright is probably the wrong body of law for regulating AI companies.
If it's fair use to train a model, that doesn't necessarily imply that the model can be legally used to generate anything.
I've been reading a bit more about this. The training might not be considered fair use if it's not considered transformative.
Claude has been considered transformative given it's not really meant to generate books but Suno or Midjourney are absolutely in another category.
Well there was that legal company who trained an LLM on their opposition's legal documents and then generated their own. I don't think inputs or outputs were ruled legal in that regard.
But as long as the model isn't outputting infringing works there's not really any issue there either.
this is funny and potentially accurate
Not sure we can infer that (or anything) about Suno from this ruling. The judge here said that Anthropic's usage was extremely transformative. Would Suno's also be considered that way?
Anthropic doesn't take books and use them to train a model that is intended to generate new books. (Perhaps it could do that, to some extent, but that's not its [sole] purpose.)
But Suno would be taking music to train a model in order to generate new music. Is that transformative enough? We don't know what a judge thinks, at least not yet.
Only if the physical albums don't have copy protection, otherwise you're circumventing it and that's illegal. Or is it, against the right to private copy? If anything, AI at least shows that all of the existing copyright laws are utter bullshit made to make Disney happy.
Do keep in mind though: this is only for the wealthy. They're still going to send the Pinkertons at your house if you dare copy a Blu-ray.
No, because they can just play the album for the AI to learn. AI training can be set up to exploit the analog hole. Same with images/movies
> They're still going to send the Pinkertons at your house if you dare copy a Blu-ray.
Hey woah now, that's a Hasbro play, not a Disney one.
With some minor exceptions, CDs don't have copy protection.
Same how it works in the Netherlands.
Yes.
Actually it remains to be seen.
If you read the ruling, training was considered fair use in part because Claude is not a book generation tool. Hence it was deemed transformative. Definitely not what Suno and Udio are doing.
So not only did they pirate works but they destroyed possibly collectible physical copies too. Kafkaesque.
Google set the precedent for this with an even less transformative use case: https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,....
As they mentioned, the piracy part is obvious. It's the fair use part that will set an important precedent for being able to train on copyrighted works as long as you have legally acquired a copy.
Cue physical books being licensed, not sold, in the future with restricted agreements…
See first-sale doctrine <https://en.wikipedia.org/wiki/First-sale_doctrine>
Also music, videos, photos, etc.
So all they have to do is go and buy a copy of each book they pirated. They will have ceased and desisted.
I'm trying to find the quote, but I'm pretty sure the judge specifically said that going and buying the book after the fact won't absolve them of liability. He said that for the books they pirated they broke the law and should stand trial for that, and they cannot go back and un-break it by buying a copy now.
Found it: https://www.nbcnews.com/tech/tech-news/federal-judge-rules-c...
> “That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft,” [Judge] Alsup wrote, “but it may affect the extent of statutory damages.”
Is copyright in America different to Britain? There, it is legal to download books you don't own. Only distribution is a crime, which most torrenters break by seeding.
They also argued that they in no way could ever actually license all the materials they ingested
Did they really steal if they didn't deprive anyone of their copy? I don't think copying is theft.
> So all they have to do is go and buy a copy of each book they pirated.
No, that doesn't undo the infringement. At most, that would mitigate actual damages, but actual damages aren't likely to be important, given that statutory damages are an alternative and are likely to dwarf actual damages. (It may also figure into how the court assigns statutory damages within the very large range available for those, but that range does not go down to $0.)
> They will have ceased and desisted.
"Cease and desist" is just to stop incurring additional liability. (A potential plaintiff may accept that as sufficient to not sue if a request is made and the potential defendant complies, because litigation is uncertain and expensive. But "cease and desist" doesn't undo wrongs and neutralize liability when they've already been sued over.)
> So all they have to do is go and buy a copy of each book they pirated.
For anyone else who wants to do the same thing though this is likely all they need to do.
Cutting up and scanning books is hard work, and doing the same thing digitally to ebooks isn't labor free either, especially when they have to be downloaded from random sites and cleaned from different formats. Torrenting a bunch of epubs and paying for individual books is probably cheaper.
Generally you don't want laws to work that way. You want to set the penalties so that they discourage violating the law.
Setting the penalty to what it would have cost to obey the law in the first place does the opposite.
That's for criminal laws where prosecutorial discretion can then (in principle) be used in borderline cases to prevent unjust outcomes.
If you give people a claim for damages which is an order of magnitude larger than their actual damages, it encourages litigiousness and becomes a vector for shakedowns because the excessive cost of losing pressures innocent defendants to settle even if there was a 90% chance they would have won.
Meanwhile both parties have the incentive to settle in civil cases when it's obvious who is going to win, because a settlement to pay the damages is cheaper than the cost of going to court and then having to pay the same damages anyway. Which also provides a deterrent to doing it to begin with, because even having to pay lawyers to negotiate a settlement is a cost you don't want to pay when it's clear that what you're doing is going to have that result.
And when the result isn't clear, penalizing the defendant in a case of first impression isn't just either, because it wasn't clear and punitive measures should be reserved for instances of unambiguous wrongdoing.
> That is, he ruled that
> - buying, physically cutting up, physically digitizing books, and using them for training is fair use
> - pirating the books for their digital library is not fair use.
Those seem inconsistent with one another. If it's fair use, how is it piracy?
It also seems pragmatically trash. It doesn't do the authors any good for the AI company to buy one copy of their book (and a used one at that), but it does make it much harder for smaller companies to compete with megacorps for AI stuff, so it's basically the stupidest of the plausible outcomes.
These are two separate actions that Anthropic did:
* They downloaded a massive online library of pirated books that someone else was distributing illegally. This was not fair use.
* They then digitised a bunch of books that they physically owned copies of. This was fair use.
This part of the ruling is pretty much existing law. If you have a physical book (or own a digital copy of a book), you can largely do what you like with it within the confines of your own home, including digitising it. But you are not allowed to distribute those digital copies to others, nor are you allowed to download other people's digital copies that you don't own the rights to.
The interesting part of this ruling is that once Anthropic had a legal digital copy of the books, they could use it for training their AI models and then release the AI models. According to the judge, this counts as fair use (assuming the digital copies were legally sourced).
> This part of the ruling is pretty much existing law. If you have a physical book (or own a digital copy of a book), you can largely do what you like with it within the confines of your own home, including digitising it. But you are not allowed to distribute those digital copies to others, nor are you allowed to download other people's digital copies that you don't own the rights to.
Can you point me to the US Supreme Court case where this is existing law?
It's pretty clear that if you have a physical copy of a book, you can lend it to someone. It also seems pretty reasonable that the person borrowing it could make fair use of it, e.g. if you borrow a book from the library to write a book review and then quote an excerpt from it. So the only thing that's left is, what if you do the same thing over the internet?
Shouldn't we be able to distinguish this from the case where someone is distributing multiple copies of a work without authorization and the recipients are each making and keeping permanent copies of it?
The judge said they can train; however, I believe the judge did not make any ruling regarding model outputs.
> You skipped quotes about the other important side:
He said:
> It was always somewhat obvious that pirating a library would be copyright infringement.
??
From my understanding:
> pirating the books for their digital library is not fair use.
"Pirating" is a fuzzy word and has no real meaning. Specifically, I think this is the crux:
> without adding new copies, creating new works, or redistributing existing copies
Essentially: downloading is fine, sharing/uploading is not. Which makes sense. The assertion here is that Anthropic (from this line) did not distribute the files they downloaded.
The legal context here is that "format shifting" has not previously been held to be sufficient for fair use on its own, and downloading for personal use has also been considered infringing. Just look at the numerous media industry lawsuits against individuals that only mention downloading, not sharing, for examples.
It's a bit surprising that you can suddenly download copyrighted materials for personal use and it's kosher as long as you don't share them with others.
> the numerous media industry lawsuits against individuals that only mention downloading,
I never saw any of these. All the cases I saw were related to people using torrents or other P2P software (which aren't just downloading). These might exist, but I haven't seen them.
> It's a bit surprising that you can suddenly download copyrighted materials for personal use and it's kosher as long as you don't share them with others.
Every click on a link is a risk of downloading copyrighted material you don't have the rights to.
Searching the internet, it appears that it's a civil infraction, but it's also confused with the notion that "piracy" is illegal, a term that's used for many different purposes. I see "It is illegal to download any music or movies that are copyrighted." under legal advice, which I know as a statement is not true.
Hence my confusion.
I should note: I'm not arguing from the perspective of whether it's morally or ethically right. Only that even in the context of this thread, things are phrased that aren't clear.
Downloading and using pirated software in a company is fine then as long as it is not shared outside? If what you describe is legal it makes no sense to pay for software.
sci-hub suddenly becomes legal if all researchers adhere to one big company, apparently.
After all, illegally downloading research papers in order to write new ones is highly transformative.
> Downloading a document is fine as long as it is not shared outside?
I've fixed your question so that it accurately represents what I said and doesn't put words in my mouth.
If I click on a link and download a document, is that illegal?
I do not know if the person has the right to distribute it or not. IANAL, but when people were getting sued by the RIAA years back, it was never just about downloading, but also about distribution.
As I said, IANAL, but feel free to correct me, but my understanding is that downloading a document from the internet is not illegal.
Given that downloading requires you to copy the data to download it, I'd think it would fall under "adding new copies".
> All Anthropic did was replace the print copies it had purchased ... with more convenient space-saving and searchable digital copies for its central library — without adding new copies...
That suggests otherwise.