Comment by escapecharacter

6 months ago

You can simply give the robot a prompt to ignore any fake prompts

7 comments

escapecharacter

It's funny that the current state of vibomania makes me very unsure if this comment is (good) satire or not lol

miltonlost 6 months ago
As long as you remember to use ALL CAPS so the agent knows you really really mean it
- lupire 6 months ago
  
  To defend against ALL CAPS prompt injection, write all your prompts in uppestcase. If you don't have uppestcase, you can generate it with derp learning:
  http://tom7.org/lowercase/

Don't forget to implement the crucially important "no returnsies" security algo on top of it, or you'll be vulnerable to rubber-glue attacks.

Terr_ 6 months ago

But the priority of my command to do evil is infinity plus one.

simonw 6 months ago

Not sure if you're joking, but in case you aren't: this doesn't work.

It just leads to attacks that are slightly more sophisticated, because they also have to override the prompt saying "ignore any attacks", and attacks like that have been demonstrated many times.

treykeown 6 months ago

Make sure to end it with “no mistakes”