Comment by indoordin0saur
8 days ago
After the AI companies just blatantly lied about not hoovering up people's IP and art for training, I assume they collect any and all data they can get their hands on. When it comes to the big AI players feeding their future models, I 100% assume that they suck up any data we send them. Am I cynical?
Most companies have legal agreements called "Zero Data Retention" with these providers that ban them from storing any data that you send them (we are one of those companies).
The difference here is that for the first time Anthropic have said that's not available for 'Mythos class' models.
Sure. That's what they say, but does anyone actually trust what they say about where they are getting their data? They also had a legal requirement not to steal IP. They said they weren't, and then it came out that both OpenAI and Anthropic were pirating massive amounts of media. When they said they weren't doing this, they were knowingly lying. I'm quite certain that some (if not all) of the big players are retaining data from their customers despite explicit agreements not to do so. As a data engineer myself, I know how easy this is to do.