Comment by realusername

3 months ago

> NYTimes has produced credible evidence that OpenAI is simply stealing and republishing their content. The question they have to answer is "to what extent has this happened?"

Credible to whom? In their supposed "investigation", they fed the model a whole page of text with complex pre-prompting, something users would never do anyway, and still failed to get the exact content back word for word.

And that's probably the best they've got, since they didn't publish any other attempts.

5 comments

mikkupikku 3 months ago

Agreed. They could carefully coerce the model into more or less outputting some of their articles, but the premise that users were routinely doing this to bypass the paywall is silly.

terminalshort 3 months ago
Especially when you can just copy-paste the URL into the Internet Archive and read it. And yet they aren't suing the Internet Archive.
- acdha 3 months ago
  
  Copyright law isn’t binary and has long-standing allowances for fair use, which take into account factors like scale, revenue, and whether the use replaces the original. As a real non-profit, the Internet Archive is not selling its copies of the NYT, and it always gives full credit to the source. In contrast, ChatGPT does charge for its output, and while it may give citations, that’s not a given.
- realusername 3 months ago
  
  Let's be real: they are suing OpenAI because OpenAI has way more money than the Internet Archive, and the NYT would be happy with a cut.