Comment by petilon

5 hours ago

Has it been proved in a court of law that it is a copyright violation?

In some cases, if the model regurgitates the original material, then that is clearly a copyright violation. But if the model "learns" from the source material just like a human brain would, then that's not a copyright violation.


0xbadcafebee 5 hours ago

No, what was proved in court was that they downloaded and trained on millions of pirated books. The court said their use of books is fair use, but stealing them isn't.

I think we're going to see cases that find distillation is also fair use. You're using the competing model like a book. You pay for it, you use it (read it), it informs your model, but you aren't repeating/reselling what the model told you verbatim. Foreign labs may still run afoul of competing labs' Terms of Service, and they may also pay a settlement (or not, it's a different jurisdiction after all), but the damage is already done. Distillation will become uncontroversial when done legally.

inigyou 3 hours ago

Are LLMs even copyrightable? If not, there's no need to speculate about fair use.

chpatrick 5 hours ago

Then distillation isn't a violation either by extension.

petilon 4 hours ago
I would agree, if they are inspecting static output of American AI models without using their compute resources.
- chpatrick 3 hours ago
  
  Scraping the internet for training is also using compute resources.
- platinumrad 3 hours ago
  
  Aren't they buying the use of these resources just like any other customer?

sleepybrett 3 hours ago

It's a 'too big to fail' model. Because they have a big swinging dick, enforcing all the copyright and other restrictions they violated would nuke them from orbit, so we can't actually hold them to account for it... for some fucking reason.

MSFT_Edging 5 hours ago

> Has it been proved in a court of law that it is a copyright violation?

God I'm so tired of this.

The billion-dollar companies have the ability to hire an army of lawyers to DDoS the legal system. At most they pay a slap-on-the-wrist fine as the cost of doing business.

pipes 3 hours ago
DDoS is a great framing of this :)
I'm extremely pro free markets etc., but the uncomfortable truth is Anthropic stole the work of thousands of authors for profit. I think it will end one of my favourite things in life: programming books.
- petilon 2 hours ago
  
  If you have ever made a painting and sold it, then you too profited from the work of thousands of artists. How so? Because your sense of what is art came from those who preceded you. You have seen the works of Picasso, Rembrandt, Monet and so on and your brain absorbed from their work, just like an LLM.
  If an LLM generalizes from thousands of authors then it is no different from what your brain does.