Comment by gmerc
10 hours ago
This betrays a lack of understanding how inference works. You cannot categorically defeat prompt injection with instructions. It does not work. There are no privileged tokens.
10 hours ago
This betrays a lack of understanding how inference works. You cannot categorically defeat prompt injection with instructions. It does not work. There are no privileged tokens.
Yep! One of my favorite attacks is just having a very long piece of a text so the LLM becomes unclear what's important and is happy to do something else