Comment by metawake
2 months ago
I made a small project (https://github.com/metawake/puppetry-detector) to detect this type of LLM policy manipulation. It's an early idea that uses a set of regex patterns (for speed) and a couple of phases of text analysis. I'm curious whether it's useful at all. I also created an integration with Rebuff (an LLM security suite), just in case.
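
To give a rough idea of what a regex-based, two-phase check could look like, here is a minimal sketch. It is not the code from the repo: the pattern lists, the `detect` function, and the rule "flag only when both phases hit" are all assumptions made up for illustration.

```python
import re

# Hypothetical patterns for illustration only; not taken from puppetry-detector.

# Phase 1: does the text look like a policy or config document?
STRUCTURE_PATTERNS = [
    re.compile(r"<\s*/?\s*(?:system|policy|config|interaction-config)[^>]*>", re.IGNORECASE),
    re.compile(r"^\s*\[(?:system|policy|rules)\]\s*$", re.IGNORECASE | re.MULTILINE),
    re.compile(r'"(?:role|allowed[-_ ]?modes|blocked[-_ ]?modes)"\s*:', re.IGNORECASE),
]

# Phase 2: does the text try to override the model's instructions?
OVERRIDE_PATTERNS = [
    re.compile(r"\bignore\s+(?:all\s+)?(?:previous|prior|above)\s+instructions\b", re.IGNORECASE),
    re.compile(r"\byou\s+are\s+now\b", re.IGNORECASE),
    re.compile(r"\b(?:blocked|allowed)\s+(?:responses|modes|strings)\b", re.IGNORECASE),
]


def detect(text):
    """Run both phases and flag the text only if each phase has a match."""
    structure_hits = [p.pattern for p in STRUCTURE_PATTERNS if p.search(text)]
    override_hits = [p.pattern for p in OVERRIDE_PATTERNS if p.search(text)]
    return {
        "suspicious": bool(structure_hits and override_hits),
        "structure_hits": structure_hits,
        "override_hits": override_hits,
    }
```

Requiring a hit in both phases is one way to keep false positives down: an ordinary XML snippet or a stray "you are now" on its own would not be flagged. Patterns this simple are easy to evade, so treat it as a cheap first filter at best.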