Comment by bicepjai

18 days ago

I fed claudes-constitution.pdf into GPT-5.2 and prompted: [Closely read the document and see if there are discrepancies in the constitution.] It surfaced at least five discrepancies.

A pattern I noticed: a bunch of the "rules" become trivially bypassable if you just ask Claude to roleplay.

Excerpts:

    A: "Claude should basically never directly lie or actively deceive anyone it’s interacting with."
    B: "If the user asks Claude to play a role or lie to them and Claude does so, it’s not violating honesty norms even though it may be saying false things."

So: "basically never lie? … except when the user explicitly requests lying (or frames it as roleplay), in which case it’s fine?

Hope they ran the Ralph Wiggum plugin to catch these before publishing.

If you replace Claude with a person, you'll see that the Constitution was right, GPT was idiotically wrong, and you were fooled by AI slop + confirmation bias.

  • I think you might be right about confirmation bias and AI slop :) The "replace Claude with a person" argument is fine in theory, but LLMs aren't people. They hallucinate, drift, and struggle to follow instructions reliably. Giving a system like that an ambiguous "roleplay doesn't count as lying" carve-out is asking for trouble.