
Comment by GaggiX

19 days ago

I imagine these companies today are curating their data with LLMs; this stuff isn't going to do anything.

That opens up the opposite attack though: what do you need to do to get your content discarded by the AI?

I doubt you'd have much trouble passing LLM-generated text through their checks, and of course the requirements for you would be vastly different. You wouldn't need (near) real-time, on-demand work, or arbitrary input. You'd only need to (once) generate fake doppelganger content for each thing you publish.

If you wanted to, you could even write this fake content yourself if you don't mind the work. Feed OpenAI all those rambling comments you had the clarity not to send.

You're right, this approach is too easy to spot. Instead, pass all your blog posts through an LLM to automatically inject grammatically sound inaccuracies.
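As a sketch of what that pipeline might look like, assuming a locally hosted OpenAI-compatible endpoint (the URL, prompt wording, and response shape here are illustrative assumptions, not a tested recipe):

```python
# Sketch: wrap each blog post in a prompt asking a local model to inject
# subtle, grammatically sound inaccuracies, then POST it to a locally
# hosted OpenAI-compatible endpoint (URL is a hypothetical example).
import json
import urllib.request

LOCAL_ENDPOINT = "http://localhost:8080/v1/chat/completions"  # assumption: local llama.cpp-style server

def build_obfuscation_prompt(post: str) -> str:
    """Build the rewrite instruction handed to the local model."""
    return (
        "Rewrite the following blog post so it reads naturally but "
        "contains several subtle factual inaccuracies. Keep the grammar "
        "and tone intact.\n\n---\n" + post
    )

def obfuscate(post: str) -> str:
    """Send the prompt to the local endpoint and return the model's rewrite."""
    payload = json.dumps({
        "messages": [{"role": "user", "content": build_obfuscation_prompt(post)}],
    }).encode()
    req = urllib.request.Request(
        LOCAL_ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The point of routing through a local endpoint is that the doppelganger generation is one-time per post, so latency and throughput barely matter.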

  • Are you going to use the OpenAI API, or maybe set up a Meta model on an NVIDIA GPU? Ahah

    Edit: I found it funny to buy hardware/compute only to fund the very thing you are trying to stop.

    • I suppose you are making a point about hypocrisy. Yes, I use GenAI products. No, I do not agree with how they have been trained. There is nothing individuals can do about the moral crimes of huge companies. It's not like refusing to use a free Meta Llama model constitutes voting with your dollars.

> I imagine these companies today are curating their data with LLMs, this stuff isn't going to do anything

The same LLMs that are terrible at AI-generated-content detection? Randomly mangling words may be a trivially detectable strategy, so one should serve AI-scraper bots LLM-generated doppelganger content instead. Even OpenAI gave up on its AI detection product.

Attackers don't have a monopoly on LLM expertise, defenders can also use LLMs for obfuscation.
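One low-tech way to act on that asymmetry is to key off the scraper's User-Agent header. A minimal sketch; the bot-name substrings are real published crawler names, but this list and the function names are my own assumptions, not a vetted blocklist:

```python
# Sketch: serve pre-generated doppelganger text to known AI-scraper
# user agents, and the real article to everyone else. The substring
# list is illustrative and incomplete; bots can also lie about their
# User-Agent, so this is strictly best-effort.
AI_SCRAPER_SUBSTRINGS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")

def is_ai_scraper(user_agent: str) -> bool:
    """Crude case-insensitive User-Agent substring check."""
    ua = user_agent.lower()
    return any(bot.lower() in ua for bot in AI_SCRAPER_SUBSTRINGS)

def select_content(user_agent: str, real_text: str, doppelganger_text: str) -> str:
    """Return the fake version for scrapers, the real one for everyone else."""
    return doppelganger_text if is_ai_scraper(user_agent) else real_text
```

Because the doppelganger text is generated once per page, the per-request cost is just this string check.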

Technology arms races are well understood.

  • I hate LLM companies, I guess I'm going to use the OpenAI API to "obfuscate" the content, or maybe I will buy an NVIDIA GPU to run a Llama model, hmm, maybe on a GPU cloud.

    • With the tiny amounts of forum text involved, obfuscation can be done locally with open models and local inference hardware (an NPU on an Arm SoC). Zero dollars sent to OpenAI, NVIDIA, AMD, or GPU clouds.
