Comment by Aurornis
5 days ago
> If I am unable to convince you to stop meticulously training the tools of the oppressor (for a fee!) then I just ask you do so quietly.
I'm kind of fascinated by how AI has become such a culture-war topic, with hyperbole like "tools of the oppressor".
It's equally fascinating how little these commenters understand about how LLMs work. Using an LLM for inference (which is what you do when you use Claude Code) does not train the LLM. It does not learn from your code and integrate it into the model while you use it. I know that breaks the "training the tools of the oppressor" narrative, which is probably why it's always ignored. If not ignored, the next step is to decry that the LLM companies are lying and stealing everyone's code despite saying they don't.
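To make the mechanical point concrete, here is a minimal sketch (assuming PyTorch and the Hugging Face transformers library; gpt2 is just a small public stand-in for any causal LM): a generate call computes no gradients and leaves every weight bit-for-bit unchanged.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Any causal LM behaves the same way here; gpt2 is just small and public.
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()  # inference mode

    # Snapshot the weights before "using" the model.
    before = {k: v.clone() for k, v in model.state_dict().items()}

    with torch.no_grad():  # no gradients are computed, so nothing can be learned
        ids = tok("def quicksort(xs):", return_tensors="pt").input_ids
        model.generate(ids, max_new_tokens=20)

    # The weights are identical after generation: inference did not train anything.
    after = model.state_dict()
    assert all(torch.equal(before[k], after[k]) for k in before)

Whether the provider later logs the prompt and trains on it offline is a separate pipeline and a separate policy question, which is what the rest of this thread is actually arguing about.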
We are not talking about inference.
The prompts and responses are used as training data. Even if your provider allows you to opt out, they are still tracking your usage telemetry and using that to gauge performance. If you don't own the storage and compute, then you are training the tools that will be used to oppress you.
Incredibly naive comment.
> The prompts and responses are used as training data.
They show a clear pop-up where you choose whether or not to allow your data to be used for training. If you don't choose to share it, it's not used.
I suppose that if someone blindly clicks through everything and hits "Accept" without touching the very obvious slider to turn it off, they could be caught off guard.
Assuming everyone who uses Claude is training their LLMs is just wrong, though.
Telemetry data isn't going to extract your codebase.
"If you don't choose to share it, it's not used"
I am curious where your confidence that this is true comes from.
Besides lots of GPUs, training data seems like the most valuable asset AI companies have. That sounds like a strong incentive to secretly use it anyway. Who would really know, if the pipelines are set up so that only very few people are aware of it?
And if it ever comes out: "oh gosh, one of our employees made a mistake."
And they have already admitted to training on pirated content. So maybe they learned their lesson... maybe not, as they are still making money and want to continue to lead the field.
I understand how these LLMs work.
I find it hard to believe that companies which stole the entire creative output of humanity, and which egregiously and continually scrape the internet, are for some reason ignoring the data you voluntarily give them.
> I know that breaks the "training the tools of the oppressor" narrative
"Narrative"? This is just reality. In their own words:
> The awards to Anthropic, Google, OpenAI, and xAI – each with a $200M ceiling – will enable the Department to leverage the technology and talent of U.S. frontier AI companies to develop agentic AI workflows across a variety of mission areas. Establishing these partnerships will broaden DoD use of and experience in frontier AI capabilities and increase the ability of these companies to understand and address critical national security needs with the most advanced AI capabilities U.S. industry has to offer. The adoption of AI is transforming the Department’s ability to support our warfighters and maintain strategic advantage over our adversaries [0]
Is 'warfighting adversaries' some convoluted code for allowing Aurornis to 'see a 1337x in productivity'?
Or perhaps you are a wealthy Westerner of a racial and sexual majority, and as such have felt little by way of oppression from this tech?
In such a case I would encourage you to develop empathy, or at least sympathy.
> Using an LLM for inference ... does not train the LLM.
In their own words:
> One of the most useful and promising features of AI models is that they can improve over time. We continuously improve our models through research breakthroughs as well as exposure to real-world problems and data. When you share your content with us, it helps our models become more accurate and better at solving your specific problems and it also helps improve their general capabilities and safety. We do not use your content to market our services or create advertising profiles of you—we use it to make our models more helpful. ChatGPT, for instance, improves by further training on the conversations people have with it, unless you opt out.
[0] https://www.ai.mil/latest/news-press/pr-view/article/4242822...
[1] https://help.openai.com/en/articles/5722486-how-your-data-is...
> Is 'warfighting adversaries' some convoluted code for allowing Aurornis to 'see a 1337x in productivity'?
Much as I despair at the current developments in the USA, and I say this as a sexual minority and a European, this is not "tools of the oppressor" in their own words.
Trump is extremely blunt about who he wants to oppress. So is Musk.
"Support our warfighters and maintain strategic advantage over our adversaries" is not blunt, it is the minimum baseline for any nation with assets anyone else might want to annex, which is basically anywhere except Nauru, North Sentinel Island, and Bir Tawil.
> "Support our warfighters and maintain strategic advantage over our adversaries" is not blunt, it is the minimum baseline for any nation with assets anyone else might want to annex
I think it's gross to reduce military violence to defending 'assets [others] might want to annex'.
What US assets were being annexed when US AI was used to target Gazans?
https://apnews.com/article/israel-palestinians-ai-technology...
> Trump is extremely blunt about who he wants to oppress. So is Musk.
> our adversaries" is not blunt
These two thoughts seem to be in conflict.
What 'assets' were being protected from annexation here by this oppressive use of the tool? The chips?
https://www.aclu.org/news/privacy-technology/doritos-or-gun