Comment by superjose
7 months ago
I did a search and found related terms: https://www.reddit.com/r/hacking/comments/1kqi0tm/how_canari...
https://medium.com/@tomer2138/how-canaries-stop-prompt-injec...
Those describe a mechanism for detecting prompt injection. If that were what happened here, we would have seen the chatbot refuse, not lie.