Comment by vintagedave
8 hours ago
Almost every comment (five) so far is against this: 'An incredibly cynical attempt at spin', 'How dare the New York Times demand access to our vault of everything-we-keep to figure out if we're a bunch of lying asses', etc.
In direct contrast: I fully agree with OpenAI here. We can have a more nuanced opinion than 'piracy to train AI is bad therefore refusing to share chats is bad', which sounds absurd but is genuinely how one of the other comments follows logic.
Privacy is paramount. People _trust_ that their chats are private: they ask sensitive questions, ones to do with intensely personal or private or confidential things. For that to be broken -- for a company to force users to have their private data accessed -- is vile.
The tech community has largely stood against this kind of thing when it's been invasive scanning of private messages, tracking user data, etc. I hope we can collectively be better (I'm using ethical terms for a reason) than the other replies show. We don't have to support OpenAI's actions in order to oppose the NYT's actions.
I suspect that many of those comments are from the Philosopher's Chair (aka bathroom), and aren't meant as literal answers but as ways of saying "OpenAI Bad". But to your point, there should be privacy-preserving ways to comply, like user anonymization, tailored searches, and so on. It sounds like the NYT is proposing a random sampling of user data. But couldn't they instead do a random sampling of their most widely read articles, looking for positive hits, rather than reviewing content on a case-by-case basis?
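The article-sampling idea above could be sketched roughly like this: start from a sample of articles, search the chat corpus for verbatim overlap, and surface only aggregate hit counts per article rather than raw chat text. This is a minimal illustration of the concept, not anyone's actual discovery protocol; all names and the shingle-matching approach are my own assumptions.

```python
# Illustrative only: count, per article, how many chats share verbatim
# n-word "shingles" with it. Raw chat text never leaves this function --
# only aggregate counts do, which is the privacy-preserving point.

def shingles(text: str, n: int = 5) -> set:
    """All runs of n consecutive words in the text, lowercased."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def article_hits(articles: dict, chats: list, n: int = 5, threshold: int = 1) -> dict:
    """For each article, count chats sharing >= threshold n-word shingles."""
    counts = {}
    for title, body in articles.items():
        art = shingles(body, n)
        counts[title] = sum(
            1 for chat in chats if len(art & shingles(chat, n)) >= threshold
        )
    return counts

articles = {"sample-article": "the quick brown fox jumps over the lazy dog today"}
chats = [
    "he said the quick brown fox jumps over fences",  # contains verbatim overlap
    "unrelated text entirely here now and always",    # no overlap
]
hits = article_hits(articles, chats)
```

Real litigation search would of course involve far more robust matching (normalization, fuzzy matching, statistical sampling), but the shape of the approach — article-first, counts-out — is what the comment is proposing.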
I hadn't heard of the philosopher's chair before, but I laughed :) Yes, I think those views were one-sided (OpenAI Bad) without thinking through other viewpoints.
IMO we can have multiple views over multiple companies and actions. And the sort of discussions I value here on HN are ones where people share insight, thought, show some amount of deeper thinking. I wanted to challenge for that with my comment.
_If_ we agree the NYT even has a reason to examine chats -- and I think even that should be where the conversation is -- I agree that there should be other ways to achieve it without violating privacy.
> The tech community has largely stood against this kind of thing when it's been invasive scanning of private messages, tracking user data
The tech community has been doing the scanning and tracking.
OpenAI is the one who chose to store the information. Nobody twisted their arm to do so.
If you store data it can come up in discovery during lawsuits and criminal cases. Period.
E.g., storing illegal materials on Google Drive, Google WILL turn that over to the authorities if there’s a warrant or lawsuit that demands it in discovery.
E.g., my CEO writes an email telling the CFO that he doesn’t want to issue a safety recall because it’ll cost too much money. If I sue the company for injuring me through a product they know to be defective, that civil suit subpoena can ask for all emails discussing the matter and there’s no magical wall of privacy where the company can just say “no that’s private information.”
At the same time, I don’t get to trawl through the company’s emails and use some email of the CEO flirting with their secretary as admissible evidence.
There are many ways the court is able to ensure privacy for the individuals. Sexual assault victims don’t have their evidence blasted across the airwaves just because the court needs to examine that physical evidence.
The only way to avoid this is to not hold the data in the first place, which is where end-to-end encryption with user-controlled keys, or simply not collecting the information at all, comes into play.
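The architecture being pointed at can be sketched in a few lines: if the client encrypts before upload and only the user holds the key, the provider stores opaque bytes and has no plaintext to produce in discovery. This is a toy sketch using a SHA-256-based keystream purely for illustration; a real client would use a vetted library (e.g. libsodium or AES-GCM), and the key material below is hypothetical.

```python
# Toy client-side encryption sketch (NOT real crypto -- illustration only).
# The point: the server stores `stored`, which is unreadable without
# `user_key`, and `user_key` never leaves the user's device.
import hashlib
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudorandom keystream from key + nonce via hashing."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    ks = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, ks))

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ct = blob[:16], blob[16:]
    ks = keystream(key, nonce, len(ct))
    return bytes(c ^ k for c, k in zip(ct, ks))

user_key = secrets.token_bytes(32)                        # stays with the user
stored = encrypt(user_key, b"a sensitive chat message")   # what the server keeps
```

Under this model a subpoena to the provider yields ciphertext only; whether courts could then compel the *user* to decrypt is a separate legal question the thread doesn't settle.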
> In direct contrast: I fully agree with OpenAI here. We can have a more nuanced opinion than 'piracy to train AI is bad therefore refusing to share chats is bad', which sounds absurd but is genuinely how one of the other comments follows logic.
These chats only need to be shared because:
- OpenAI pirated masses of content in the first place
- OpenAI refuses to own up to it even now (it spins the NYT's claims as "baseless").
I don't agree with them giving my chats out either, but the blame is not with the NYT in my opinion.
> We don't have to support OpenAI's actions in order to oppose the NYT's actions.
Well, the NYT's action is about more than just its own case. If they win, it will set a precedent that lets other news outlets get money from OpenAI as well. Which makes a lot of sense: after all, OpenAI has billions to invest in hardware, so why not in content?
And what alternative do they have? Without OpenAI giving access to the source materials used in training (I assume this was already asked for, because it is the most obvious route), there is not much else they can do. And OpenAI won't do that, because it would prove the NYT's point and mean they have to pay a lot to half the world.
It's important that this case is made, not just for the NYT but for journalism in general.