Comment by hebejebelus
16 hours ago
The constitution contains 43 instances of the word 'genuine', which is my current favourite marker for telling if text has been written by Claude. To me it seems like Claude has a really hard time _not_ using the g word in any lengthy conversation even if you do all the usual tricks in the prompt - ruling, recommending, threatening, bribing. Claude Code doesn't seem to have the same problem, so I assume the system prompt for Claude also contains the word a couple of times, while Claude Code may not. There's something ironic about the word 'genuine' being the marker for AI-written text...
You're absolutely right!
You're looking at this exactly the right way.
What you're describing is not just true, it's precise.
Do LLMs arrive at these replies organically? Is it baked into the corpus, emerging naturally? Or are these artifacts of these companies' internal prompting?
It's not just a word— it's a signal of honesty and credibility.
Perfect!
Now that you mention it, a funny expression considering the supposed emphasis they have on honesty as a guiding principle.
I apologize for the oversight
Ah, I see the problem now.
How can problems be real if our eyes aren't real
I feel there should be a database of shibboleths like this one; it would really change how you look at anything written on the internet.
The wikipedia page Signs of AI Writing is quite a good one: https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing
But it's a game of whack-a-mole really, and I'm sure I'm already reading and engaging with some double-digit percentage of entirely AI-written text without realising it.
maybe it uses the g word so much BECAUSE it’s in the constitution…
I expect they co-authored the constitution and other prior 'foundational documents' with Claude, so it's probably a chicken-and-egg thing.
I believe the constitution is part of its training data, and as such its impact should be consistent across different applications (e.g. Claude Code vs Claude Desktop).
I, too, notice a lot of differences in style between these two applications, so it may very well be due to the system prompt.
I would like to see more agent harnesses adopt rules that are actually rules. Right now, most of the "rules" are really guidelines: the agent is free to ignore them and the output will still go through. I'd like to be able to set simple word filters that deterministically block an output completely and kick the agent back into thinking to correct it. This wouldn't have to be terribly advanced to fix a lot of slop. Disallow "genuine," disallow "it's not x, it's y," maybe get a community blacklist going a la adblockers.
Seems like a postprocess step on the initial output would fix that kind of thing - maybe a small 'thinking' step that transforms it to match the desired style.
Yeah, that's how it would be implemented after a filter fail, but it's important that the filter itself be separate from the agent, so it can be deterministic. Some problems, like "genuine," are so baked into the models that they will persist even if instructed not to, so a dumb filter, a la a pre-commit hook, is the only way to stop it consistently.
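For illustration, here's a minimal sketch of what that could look like in Python. Everything here is made up for the example: `generate(prompt)` is a hypothetical stand-in for whatever call the harness makes to the model, and the pattern list, retry count, and function names are all illustrative, not any real harness's API.

```python
import re

# Deterministic blocklist, kept entirely outside the model. Patterns are
# plain regex over the final text, so the same output always passes or
# fails the same way.
BLOCKLIST = [
    # Ban "genuine"/"genuinely" outright.
    re.compile(r"\bgenuinely?\b", re.IGNORECASE),
    # Crude approximation of the "it's not x, it's y" construction,
    # e.g. "not just a word, it's a signal".
    re.compile(r"\bnot (?:just|only)\b[^.]{0,60}[,;—–-]\s*it'?s\b", re.IGNORECASE),
]

def violations(text: str) -> list[str]:
    """Return the blocklisted patterns that the text matches, if any."""
    return [p.pattern for p in BLOCKLIST if p.search(text)]

def filtered_generate(generate, prompt: str, max_retries: int = 3) -> str:
    """Run the agent, rejecting any output that trips the filter.

    `generate` is a hypothetical callable wrapping the model; the filter
    itself never touches the model, so it stays deterministic.
    """
    current_prompt = prompt
    for _ in range(max_retries + 1):
        text = generate(current_prompt)
        bad = violations(text)
        if not bad:
            return text
        # Filter fail: kick the agent back into a correction pass
        # with concrete feedback about what was rejected.
        current_prompt = (
            f"{prompt}\n\nYour previous draft was rejected because it "
            f"matched these banned patterns: {bad}. Rewrite it without them."
        )
    raise RuntimeError("Output still trips the word filter after retries.")
```

Because the check is plain regex over the finished text, it stays deterministic and model-independent, like the pre-commit-hook analogy above, and a community-maintained pattern list could slot into `BLOCKLIST` much the way adblock filter lists do.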
You are probably right, but without all the context here one might counter that the concept of authenticity should feature prominently in this kind of document regardless. And using a consistent term is probably the advisable style as well: we probably don't need "constitution" writers with a thesaurus nearby, right?
Perhaps so, but there are only 5 uses of 'authentic', which I feel is almost an exact synonym and a similarly common word - I wouldn't think you need a thesaurus for that one. Another relatively semantically close word, 'honest', shows up 43 times as well, but there's an entire section headed 'being honest', so that's pretty fair.
There's also an entire section on "what constitutes genuine helpfulness"