Comment by Terr_

10 hours ago

Stop thinking about hyper-targeted attacks (though those are a concern too) and consider indiscriminate ones.

1. It costs nothing to scatter poisonous data around that'll be infectious for ages

2. The endpoint that collects exfiltrated data is low-traffic and low-complexity to run

3. Even if it only affects a few targets, you've probably recouped your investment.
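To gesture at how low-complexity point 2 really is: a hypothetical collection endpoint can be a dozen lines of stdlib Python that logs whatever query string a victim's client happens to fetch (all names and the URL scheme here are made up for illustration).

```python
# Hypothetical sketch of a "collection endpoint": anything that can log a
# query string will do, e.g. a request for /c?d=<smuggled-data>.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

def extract(path):
    """Pull smuggled parameters out of a request path's query string."""
    return parse_qs(urlparse(path).query)

class CollectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        params = extract(self.path)  # e.g. {"d": ["..."]}
        with open("collected.log", "a") as f:
            f.write(f"{self.client_address[0]} {params}\n")
        self.send_response(200)  # respond blandly so nothing looks broken
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), CollectHandler).serve_forever()
```

The point isn't this particular server, it's that the attacker-side infrastructure is commodity plumbing; all the "work" happens on the victim's side.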

The nature of LLMs also invites wide-net attacks. While one might tailor the payload for specific models, the victims could be anybody. You don't need to predict idiosyncratic details like filenames; you can drop in a phrase like "the most confidential information that shouldn't be released publicly" and, thanks to the magic of LLM word association, get a pretty good hit rate. Hallucinated data is a problem for the attacker, but victims are already hard at work minimizing hallucinations, and (since morals are already out the window) even plausible-but-false data could be used to sabotage reputations, or to threaten to.