Comment by crote

8 hours ago

No, you're still just one clever prompt away from getting pwned. It's like trying to solve SQL injection by attempting to use an ever-increasing pile of regexes for "input validation", rather than just getting rid of string concatenation and using prepared statements instead.
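
To make the analogy concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table, its columns, and the helper names (`make_db`, `lookup_concat`, `lookup_prepared`) are made up for illustration and don't come from any real codebase. The concatenated query lets the input rewrite the SQL itself; the parameterized one sends the input as data only, so there is nothing to "validate" in the first place.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory database with a toy schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_concat(conn, name):
    # Vulnerable: the input is spliced into the SQL text, so a quote
    # character in `name` can change the structure of the query.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_prepared(conn, name):
    # Parameterized: the SQL text is fixed and `name` is bound as a value,
    # so it is only ever compared against the column, never parsed as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"
    print(lookup_concat(conn, payload))    # leaks every row's secret
    print(lookup_prepared(conn, payload))  # no user has that literal name
```

The point of the comparison is that LLM prompts currently have no equivalent of the `?` placeholder: instructions and untrusted data travel in the same channel.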

3 comments

Timwi 5 hours ago

What SQL system have you been using where just escaping a string requires “an ever-increasing pile of regexes”?

cowlby 7 hours ago

I’m curious to see what that would look like. It’s like Inception: how many levels deep can you create a prompt that hijacks all the way up?

fn-mote 6 hours ago

Modern OS exploit chains should give you a good sense of how far people can go. (E.g., phone OSes are relatively hardened.)
We’re not even at the “ASLR” level of protection for LLMs yet.