Comment by jcranmer
8 hours ago
> In copyright cases, typically you need to show some kind of harm.
NYT is suing for statutory copyright infringement. That means you only need to demonstrate the copyright infringement itself, since the infringement alone is considered harm; the actual harm only matters if you're suing for actual damages.
This case really comes down to the very unsolved question of whether or not AI training and regurgitation is copyright infringement, and if so, if it's fair use. The actual ways the AI is being used are thus very relevant for the case, and totally within the bounds of discovery. Of course, OpenAI has also been engaging this lawsuit with unclean hands in the first place (see some of their earlier discovery dispute fuckery), and they're one of the companies with the strongest "the law doesn't apply to US because we're AI and big tech" swagger.
NYT doesn't care about regurgitation. When it was doable, it was spotty enough that no one would rely on it. But now the "trick" doesn't even work anymore (you would paste the start of an article and chatgpt would continue it).
What they want is to kill training, and moreover, prevent the loss of being the middle-man between events and users.
> prevent the loss of being the middle-man between events and users
I'm confused by this phrase. I may be misreading but it sounds like you're frustrated, or at least cynical about NYT wanting to preserve their business model of writing about things that happen and selling the publication. To me it seems reasonable they'd want to keep doing that, and to protect their content from being stolen.
They certainly aren't the sole publication of written content about current events, so calling them "the middle-man between events and users" feels a bit strange.
If your concern is that they're trying to prevent OpenAI from getting a foot in the door of journalism, that confuses me even more. There are so, so many sources of news: other news agencies, independent journalists, randos spreading word-of-mouth information.
It is impossible for chatgpt to take over any aspect of being a "middle-man between events and users" because it can't tell you the news. It can only resynthesize journalism that it's stolen from somewhere else, and without stealing from others, it would be worse than the least reliable of the above sources. How could it ever be anything else?
This right here feels like probably a good understanding of why NYT wants openai to keep their gross little paws off their content. If I stole a newspaper off the back of a truck, and then turned around and charged $200 a month for the service of plagiarizing it to my customers, I would not be surprised if the Times's finest lawyers knocked on my door either.
Then again, I may be misinterpreting what you said. I tend to side with people who sue LLM companies for gobbling up all their work and regurgitating it, and spend zero effort trying to avoid that bias.
> What they want is to kill training, and more over, prevent the loss of being the middle-man between events and users.
So... they want to continue reporting news, and they don't want their news reports to be presented to users in a place where those users are paying someone else and not them. How horrible of them?
If NYT is not reporting news, then NYT news reports will not be available for AIs to ingest. They can perhaps still get some of that data from elsewhere, perhaps from places that don't worry about the accuracy of the news (or intentionally produce inaccurate news). You have to get signal from somewhere, just the noise isn't enough, and killing off the existing sources of signal (the few remaining ones) is going to make that a lot harder.
The question is, does journalism have a place in a world with AIs, and should OpenAI be the one deciding the answer to that question?
It sounds like the defendant would much prefer middle-men who do not have the resources to enforce copyright.
> prevent the loss of being the middle-man between events and users.
OpenAI is free to do its own reporting. NY Times is nowhere near trying to prevent others from competing as a middleman.
It’s more than being a middle man, right? Like if visits to NYT reduce then they get less ad revenue and their ability to do business goes away. On the other hand, if they demand licensing fees then they’ll just be marginalized by other news anyways.