Comment by Retr0id

13 hours ago

The fact that LLMs are "smarter" is also their weakness. An old-school classifier is far from foolproof, but you won't get past it by telling it about your grandma's bedtime story routine.

reassess_blind 11 hours ago

Fairly hard to bypass the latest LLMs with grandma's bedtime story these days, to be fair.

Retr0id 11 hours ago

That specific trick, yes, but the general concept still applies.

reassess_blind 11 hours ago

It does, but it's certainly not trivial. In fact, there's an unclaimed $1000 bounty on prompt injecting OpenClaw: https://hackmyclaw.com/
  