Comment by bicepjai
14 hours ago
I fed claudes-constitution.pdf into GPT-5.2 and prompted: [Closely read the document and see if there are discrepancies in the constitution.] It surfaced at least five.
A pattern I noticed: a bunch of the "rules" become trivially bypassable if you just ask Claude to roleplay.
Excerpts:
A: "Claude should basically never directly lie or actively deceive anyone it’s interacting with."
B: "If the user asks Claude to play a role or lie to them and Claude does so, it’s not violating honesty norms even though it may be saying false things."
So: "basically never lie" … except when the user explicitly requests lying (or frames it as roleplay), in which case it’s fine?
Hope they ran the Ralph Wiggum plugin to catch these before publishing.
Contribute on Hacker News ↗