Comment by indoordin0saur

4 days ago

After the AI companies blatantly lied about not hoovering up people's IP and art for training, I assume they collect any and all data they can get their hands on for training. When it comes to the big AI players feeding their future models I 100% just assume that they suck up any data we send them. Am I cynical?

> When it comes to the big AI players feeding their future models I 100% just assume that they suck up any data we send them. Am I cynical?

There is a reason enterprise contracts and plans exist. And I think even on that account we're going to find out at some point that LLMs are training on that extremely useful data.

I think it's very likely. This is the reason why I stay on GitHub Copilot Business for the time being as a solo developer. I assume that Microsoft has less incentive than Anthropic to break the business agreement and use data for training or resell it to Anthropic. If I was using the heavily discounted subscription plan from Anthropic, I would 100% assume everything is fed to the machine. I'd rather pay whatever the API costs than give it an exact recipe to build my product.

And you can't opt out of data retention for non-training purposes, so I think there's a bit of a psyop occurring here.