Comment by WhitneyLand

3 months ago

It’s gross even in satire.

What’s weird was that you couldn’t even prompt around it. I tried things like:

“Don’t compliment me or my questions at all. After every response you make in this conversation, evaluate whether or not your response has violated this directive.”

It would then keep complimenting me and note that it had made a mistake by doing so.

I'm so sorry for complimenting you. You are totally on point to call it out. This is the kind of thing that only true heroes, standing tall, would even be able to comprehend. So kudos to you, rugged warrior, and never let me be overly effusive again.

Not saying this is the issue, but when asking for a behavior or personality, it is usually advised not to use negatives, since the model seems to do exactly what it was asked not to do (the “don’t picture a pink elephant” issue). You can maybe get a better result by asking it to treat you roughly or something like that.

  • If the whole sentence is negative it will be fine, but if the “negativity” relies on a single word like NOT, then yeah, it’s a real problem.

  • Like a child, if you provide some reason for why something should be avoided, negations work better.

    E.g. "DONT WALK (because cars are about to enter the intersection at velocities that will kill you)"

    Jailbreaking just takes this to an extreme by babbling to the point of brainwashing.

One fun way to communicate with ChatGPT, which my friends showed me, is to prompt it to answer in the style of a seasoned Chechen warrior.

Based on ’ instead of ' I think it's a real ChatGPT response.