Comment by Terr_

19 days ago

Wouldn't your own LLM be overkill? Ideally one would generate decoy junk more much efficiently than these abusive/hostile attackers can steal it.

I still think this could worthwhile though for these reasons.

- One "quality" poisoned document may be able to do more damage - Many crawlers will be getting this poison, so this multiplies the effect by a lot - The cost of generation seems to be much below market value at the moment

I didn't run the text generator in real time (that would defeat the point of shifting cost to the adversary, wouldn't it?). I created and cached a corpus, and then selectively made small edits (primarily URL rewriting) on the way out.