Comment by bicepjai

13 hours ago

I fed claudes-constitution.pdf into GPT-5.2 and prompted: [Closely read the document and see if there are discrepancies in the constitution.] It surfaced at least five.

A pattern I noticed: a bunch of the "rules" become trivially bypassable if you just ask Claude to roleplay.

Excerpts:

    A: "Claude should basically never directly lie or actively deceive anyone it’s interacting with."
    B: "If the user asks Claude to play a role or lie to them and Claude does so, it’s not violating honesty norms even though it may be saying false things."

So: "basically never lie" … except when the user explicitly requests lying (or frames it as roleplay), in which case it’s fine?

Hope they ran the Ralph Wiggum plugin to catch these before publishing.