
Comment by andai

1 month ago

Yesterday I asked ChatGPT to riff on some humorous Pompeii graffiti. It said it couldn't do that because it violated the policy.

But it was happy to tell me all sorts of extremely vulgar historical graffiti, or to translate my own attempts.

What was forbidden here, it seemed, was not the sexual content, but creativity in a sexual context, which I found very interesting. (I think this is designed to stop sexual roleplay. Although I believe OpenAI is preparing to release a "porn mode" for exactly that scenario, but I digress.)

Anyway, I was annoyed because I wasn't trying to make porn, I was just trying to make my friend laugh (he is learning Latin). I switched to Claude and had the opposite experience: shocked by how vulgar the responses were! That's exactly what I asked for, of course, and that's how it should be imo, but I was still taken aback because every other AI had trained me to expect "pg-13" stuff. (GPT literally started its response to my request for humorous sexual graffiti with "I'll keep it PG-13...")

I was a little worried that if I published the results, Anthropic might change that policy though ;)

Anyway, my experience with Claude's ethics is that it's heavily guided by common sense and context. For example, much of what I discuss with it (spirituality and unusual experiences in meditation) gets the "user is going insane, initiate condescending lecture" mode from GPT. Whereas Claude says "yeah I can tell from context that you're approaching this stuff in a sensible way" and doesn't need to treat me like an infant.

And if I were actually going nuts, I think as far as harm reduction goes, Claude's approach of actually meeting people where they are makes more sense. You can't help someone navigate an unusual worldview by rejecting it entirely. That just causes more alienation.

Whereas blanket bans on anything borderline come across not as harm reduction, but as a cheap way to cover your own ass.

So I think Anthropic is moving even further in the right direction with this one, focusing on deeper underlying principles rather than a bunch of surface-level rules. Just from my experience so far interacting with the two approaches, that definitely seems like the right way to go.

Just my two cents.

(Amusingly, Claude and GPT have changed places here. Time was when, for years, I wanted to use Claude but it shut down most conversations I wanted to have with it, whereas ChatGPT was happy to engage on all sorts of weird subjects. At some point they switched sides.)

Oh good, maybe in the future I can get a job doing erotic roleplay for hire when my software dev job gets devoured

Yesterday it said it couldn't directly reference some text I pasted because it contained a curse word, but it did offer to remove all the curse words.

ChatGPT's self-censoring went through the roof after v5, and it was already pretty bad before.