Comment by anonymoose12
5 months ago
I’ve been using a flawless “jailbreak” for every iteration of ChatGPT which I came up with (it’s just a few words). ChatGPT believes whatever you tell it about morals, so it’s been easy to make erotica as long as neither the output nor prompt uses obviously bad words.
I can’t convince o1 to fall for the same trick. It checks repeatedly that it’s complying with OpenAI policy guidelines and utterly neuters any response that’s even a bit spicy in tone. I’m sure they’ll recalibrate at some point; it’s pretty aggressive right now.