Comment by anonymoose12
5 months ago
I’ve been using a flawless “jailbreak” for every iteration of ChatGPT which I came up with (it’s just a few words). ChatGPT believes whatever you tell it about morals, so it’s been easy to make erotica as long as neither the output nor prompt uses obviously bad words.
I can’t convince o1 to fall for the same trick. It checks repeatedly that it’s complying with OpenAI policy guidelines and utterly neuters any response that’s even a bit spicy in tone. I’m sure they’ll recalibrate at some point; it’s pretty aggressive right now.