Comment by realusername
6 hours ago
> NYTimes has produced credible evidence that OpenAI is simply stealing and republishing their content. The question they have to answer is "to what extent has this happened?"
Credible to whom? In their supposed "investigation", they sent a whole page of text and complex pre-prompting and still failed to get the exact content back word for word. That's something users would never do anyway.
And that's probably the best they've got as they didn't publish other attempts.
Agreed, they could carefully coerce the model to more or less output some of their articles, but the premise that users were routinely doing this to bypass the paywall is silly.
Especially when you can just copy and paste the URL into the Internet Archive and read it. And yet they aren't suing the Internet Archive.
Copyright law isn’t binary and has long-running allowances for fair use, which take into consideration factors like scale, revenue, and whether the use replaces the original. As a real non-profit, the Internet Archive is not selling its copies of the NYT, and it always gives full credit to the source. In contrast, ChatGPT does charge for its output, and while it may give citations, that’s not a given.