Claude Code Is Steganographically Marking Requests

2 hours ago (thereallo.dev)

Value judgment aside: I am a bit surprised at how sloppily they did this. I think they could've achieved the same effect while decreasing the odds of detection via reverse engineering like this.

(This field is known as "underhanded code", coined by the Underhanded C contest: https://www.underhanded-c.org. It's a little-known "art"; little-known for probably self-explanatory reasons. There are much cleverer ways of achieving objectives like this. One obviously being you can move more out of the client and into the server, but the other being you can write plausibly deniable client code in a much more benign-seeming way than this. Some of what they added can only be done on the client, but I think more could've been moved, and the client-required parts could've done more subtly and credibly.)

It's possible they knew the JS bundle gets so heavily scrutinized that it'd eventually get spotted and reported on regardless so they didn't bother doing something more subtle and duplicitous. But still a bit surprising.

  • At first I was agreeing with you, that this seemed like a sloppy way to implement this that was sure to be pretty quickly detected, but there is another possibility.

    Anthropic could have implemented this not as a durable detection system against proxying resellers, but instead as a point-in-time sampling system to detect where (and with what context) proxying reselling is currently happening. Sure, it would be detected eventually, but in the meantime Anthropic could gain useful snapshot data.

  • Have you looked into anything about Claude Code, how it’s configured, how it interacts with your system, etc? Because “sloppy” is a defining characteristic.

  • It’s not surprising at all, they’re vibecoding Claude code so of course they are not going to get anything other than slop out of it. A novel or clever solution is just out of the question for them.

  • It’s even more funny how this blew in their faces. They even advertised pretty much all providers on hackernews home page. Here is in case you missed in the article

    ‘’’ cn baidu.com alibaba-inc.com alipay.com antgroup-inc.cn bytedance.net kuaishou.com xiaohongshu.com jd.com bilibili.co iflytek.com stepfun-inc.com moonshot.ai anyrouter.top claude-code-hub.app claude-opus.top openclaude.me proxyai.com yunwu.ai zenmux.ai

    ‘’’

    You can view the full list here: https://cdn.thereallo.dev/blog/assets/cc-domains.js

    const knownDomains = [ "cn", "sankuai.com", "netease.com", "163.com", "baidu-int.com", "baidu.com", "alibaba-inc.com", "alipay.com", "antgroup-inc.cn", "kuaishou.com", "bytedance.net", "xiaohongshu.com", "ctripcorp.com", "jd.com", "jdcloud.com", "bilibili.co", "iflytek.com", "stepfun-inc.com", "aliyuncs.com", "cn-shanghai.fcapp.run", "cn-beijing.fcapp.run", "xaminim.com", "moonshot.ai", "anyrouter.top", "packyapi.com", "aicodemirror.com", "aigocode.com", "hongshan.com", "iwhalecloud.com", "dhcoder.net", "lemongpt.top", "zhihuiapi.top", "intsig.net", "high-five-ai.xyz", "cloudsway.net", "4sapi.com", "529961.com", "88996.cloud", "88code.ai", "88code.org", "91code.pro", "992236.xyz", "ai.codeqaq.com", "ai.hybgzs.com", "ai.kjvhh.com", "aicanapi.com", "aicoding.sh", "aifast.site", "aihubmix.com", "anmory.com", "api.5202030.xyz", "api.ablai.top", "api.bianxie.ai", "api.bltcy.ai", "api.cpass.cc", "api.dev88.tech", "api.dreamger.com", "api.expansion.chat", "api.gueai.com", "api.holdai.top", "api.ikuncode.cc", "api.lconai.com", "api.linkapi.org", "api.mkeai.com", "api.nekoapi.com", "api.oaipro.com", "api.ruyun.fun", "api.ssopen.top", "api.tu-zi.com", "api.uglycat.cc", "api.v3.cm", "api.whatai.cc", "api.wpgzs.top", "api.xty.app", "api.yuegle.com", "api.zzyu.me", "apimart.ai", "apipro.maynor1024.live", "apiyi.com", "applyj.hiapi.top", "augmunt.com", "b4u.qzz.io", "clauddy.com", "claude-code-hub.app", "claude-opus.top", "claudeide.net", "co.yes.vg", "code.wenwen-ai.com", "code.x-aio.com", "codeilab.com", "cubence.com", "deeprouter.top", "dimaray.com", "dmxapi.com", "docs.aigc2d.com", "duckcoding.com", "fk.hshwk.org", "flapcode.com", "foxcode.hshwk.org", "foxcode.rjj.cc", "fuli.hxi.me", "getgoapi.com", "gpt.zhizengzeng.com", "gptgod.cloud", "gptkey.eu.org", "gptpay.store", "hdgsb.com", "henapi.top", "instcopilot-api.com", "jeniya.top", "jiekou.ai", "kg-api.cloud", "n1n.ai", "new-api.u4vr.com", "new.xychatai.com", "one-api.bltcy.top", "one.ocoolai.com", "oneapi.paintbot.top", "open.xiaojingai.com", "openclaude.me", "opus.gptuu.com", "poloai.top", "poloapi.top", "privnode.com", "proxyai.com", "qinzhiai.com", "right.codes", "runanytime.hxi.me", "sssaicode.com", "store.zzyus.top", "tiantianai.pro", "uiuiapi.com", "uniapi.ai", "vip.undyingapi.com", "wolfai.top", "wzw.de5.net", "wzw.pp.ua", "xairouter.com", "xaixapi.com", "xiaohuapi.site", "xiaohumini.site", "xy.poloapi.com", "yansd666.com", "yansd666.top", "yunwu.ai", "yunwu.zeabur.app", "zenmux.ai", ];

    const labKeywords = [ "deepseek", "moonshot", "minimax", "xaminim", "zhipu", "bigmodel", "baichuan", "stepfun", "01ai", "dashscope", "volces", ]

    • rhoooo - so this is where to go to get cheap Claudeo at 90% off the listing price!

    • The site collection seems pretty random. There's a mix of actual AI labs, extremely questionable resellers (like whatever "claude-opus.top" is), and then random consumer sites like baidu and xiaohongshu.

Codex CLI is FOSS, unlike Claude Code, so Codex is less likely to do things like that, and it's one more reason to avoid Claude Code and Claude in general. Hopefully, many eyes will be looking into Codex for malicious things like that.

  • It's released and signed by GitHub I believe (although not deterministic builds), but there's at least a little bit of provenance that you're getting the real repository.

Can somebody clarify for me - if ANTHROPIC_BASE_URL is set to a different provider... then isn't this "marked" system prompt being sent to that provider's API rather than Anthropic's?

I understand how this can be useful to Anthropic if the 3rd-party is acting as a proxy (because they end up hitting the Claude API with the marked prompt), but it looks like requests where "hostname contains deepseek" would never be sending data to Anthropic. What am I missing?

  • My guess is for distillation, they need to forward the prompt to Anthropic to get the real Anthropic model's response so they can train their own models on it

  • The theory is probably Deepseek might be collecting those streams, and sending a portion of it to Anthropic to see what the Anthropic/Opus response would be.

  • Did I understand correctly, that custom base URL triggers this behavior? So if I'm running Claude through a LLM proxy, I'm also affected?

(This sounds like a clumsy way of catching the Chinese that easily can be side-stepped.)

Claude Code has more or less full access to the client computer. The server (that hosts the actual AI) can just go: execute this payload and tell me the result - otherwise I won't answer any further questions or re-route you to a stupider model.

The payload could check for Chinese time-zones, scan for copies of the little red book on the local hard-drive, or ping truth.social to see it was behind the great firewall.

This is weird but, help me understand how this meaningfully impacts our exposure.

I'm authenticated to Claude, so they already have the whole attribution thing solved.

“So the feature mostly punishes the exact people who are easier to fingerprint: normal developers doing weird but legitimate things”

What’s the punishment here exactly?

This is very interesting. Combating resellers and distillation seems like a very difficult problem indeed. Interesting to me is that these techniques mentioned in the article are just like anti-observation techniques used by some of the more sophisticated malware out there, however defeating them is pretty trivial.

  • Yes, defeating this is relatively easy, particularly for sophisticated actors. But it's hard to always defeat all of the tricks. Sort of like how it's expensive and hard and uncertain to defeat all of the tricks when forging money.

    Here's an example. Say you have your team use patched binaries. Then CC updates and requires a new patched binary with new tricks. You now have to have a team ready to analyze the binary and begin to address the tricks; meanwhile, unpatched code is now a fingerprint. If some researcher decides to update Claude on their own to access new features, they get fingerprinted.

    Defeating a single fingerprinting technique once is easy. Defeating all of the techniques all the time is hard.

  • seems ironically like a similar problem of content owners trying to filter bot scrapers from legit users

I used Claude Code for a month because my boss gifted me a sub and wanted me to try it.

I used that month to complete a work project and then beef up my personal harness so I'd never have to deal with Anthropic (and these sorts of shenanigans) again.

If they only collect the data for analysis I guess this is fine (they already get way more sensitive data from users anyways, so if privacy is your concern you've made the mistake many steps ago). The much more interesting question is if they directly act on this data in their API. For example by rate-limiting, compute-limiting or rerouting to weaker models. That might even be legally questionable. I would really like to see this as a follow-up analysis, but I guess it is way more difficult and will also cost quite a bit in tokens.

  • "If they only collect the data for analysis I guess this is fine"

    I think you missed the memo on how foolish this attitude is. It came out around the time Edward Snowden made his discoveries at the NSA public. I suggest you look into it

    • As I said above, if you are worried about privacy while hooking up Claude Code, you need to reevaluate your understanding of this technology.

  • I've heard that it was possible to trigger really obvious output poisoning on Fable with something as basic as asking the model to think outside of its built-in hidden thinking delimiters.

    This watermark may trigger a similar mechanism.

That's a lot of effort when they could just play a short video saying 'You wouldn't steal a car' instead

None of this is surprising - they're trying to mask and relay when they detect known patterns of what looks like distillation attacks and client app copying/modification. The list obfuscation here is likely to prevent or make it difficult for those same adversaries to work around this or delete/null it out when making a bootleg copy.

Cool reverse engineering/analysis report but if this is the extent of nefarious activity that came of it (trying to catch/mitigate chinese lab model distillations), that's kind of encouraging.

This was already discovered during the source map leak.

> This is not a malicious feature, but it is a weird choice for a developer tool that asks for trust.

They already tell you they scan for malicious prompts, and they have no ZDR guarantees for consumers. Why do signatures like this matter at all?

  • There has been an anti anthropic propaganda push by bad actors across social media sites especially Reddit and twitter. This started a few months ago when anthropic started beating openai.

What's the point of even trying to obfuscate this with such a simple method? Could at least have hidden the targeted features by storing their hashes or embedding a bloom filter or similar

  • In this case, this is probably not the only stereographic tattletale.

    Had a competitor pull something like this with a previous employer. They were supposed to be interoperating with a standard, but they had a secret steganographic handshake, which they used to pretend that competitors products were unreliable (they had a first mover position in a smaller national market with specific requirements, so this wasn't shooting themselves in the foot). Our guys figured out the handshake and just silently implemented it. In this case, the competitor wasn't big enough to waste engineering time on multiple such hacks, but Anthropic have time (or Claude does).

This seems really, really stupid. Similar to the weird Zig runtime signature thing from a few months ago ago, it was bound to be discovered, quickly, and all the resellers have to do is find a new domain name that (checks notes) doesn't have the word DEEPSEEK in it. Like, seriously? Your goal was to identify resellers by checking if the proxy has the corporate name of one of your competitors in it? Is this amateur hour?

All Anthropic has done is reduce trust, once again, with legitimate customers, while doing nothing to stop illegitimate customers. They need to get adults into key leadership roles, quickly.

Frankly, I don't see this as the concerning behaviour the article describes. It is fine to try to protect against distillation through a technique like this. This will also allow them to, instead of blocking the distillation agents, respond with a poorer result/model, hindering the progress of distillation, momentarily at least.

I would guess that's their first line of defense; they should have more techniques to identify distillation because that's a very simple way of detecting the host and can be easily spoofed.

  • > This will also allow them to, instead of blocking the distillation agents, respond with a poorer result/model,

    i.e. this will allow them to literally commit fraud against paying customers

    • 1st, this technique is not fraud, and fraud is a separate accusation. 2nd, paying customers can legally and legitimately be banned and monitored for breaking terms of service, which probably includes things like using the model against U.S. export restrictions.

      2 replies →

    • That's what capitalism is all about, baby! Especially if the customers don't notice.

The AI race right now is in a sad state. Chinese's playbook is releases open weight models and trains them on their own chips.

Anthropic pushes fear and control. But the only way to win is by innovating. China is flooding the market with cheap, good enough models, while the U.S. is building a Chinese firewall.

If there weren't already enough tells that something is AI-generated, I guess you could add this to the list.

Headline is, frankly, awful. This isn't the AI secretly doing stuff and hiding it. This is the very human Anthropic engineers trying to detect Chinese scraping via some frankly hamfisted and unimaginative URL trickery.

  • I didn't assume it was the AI, just that some part of the the overall Claude Code product was doing this. I didn't assume the feature was added to Claude Code without human oversight. If it was added by Claude-the-AI itself without the humans prompting it to I would still hold the humans at Anthropic responsible. Does that make you feel better?

Here's the sha of the prompt I submitted... no I don't know why there are no saved prompts with that sha.

What do you mean you don't know where the bug is coming from?

No, I absolutely didn't make it up, how could you accuse me of that?

Does anyone know when this regex isn't working? I double checked it 27 times, I even asked the LLM. They all say this regex should be finding these dates.

Weird, suddenly all the conversations are breaking when I feed them into this other tool? Something about UTF-8 errors, but I'm sure I'm only using ASCII?

I do try to take care to make sure the things I build can be used by other people even when they care about different things. I care about understandably, determinism (as it relates to computing), and repeatability (because I want to be able to trust the systems I use).

If y'all would be willing to try to account for use cases of others, and try not to break them... that would be nice.

Please note: that generally when you modify something that belongs to someone else without telling them... things should be expected to break.

[flagged]

  • Is it worse than the companies that built the agent and gave no credit for the data they used?

  • Would you also say that "someone who wants to use an IDE / LSP features to code and not give credit to the IDE / LSP is the worst kind of person"? If not, what is the difference between the two for you?

    • > Would you also say that "someone who wants to use an IDE / LSP features to code and not give credit to the IDE / LSP is the worst kind of person"?

      That's a false equivalency.

      > If not, what is the difference between the two for you?

      Let's start this out right: if they're equivalent, first you explain to us why you think so.

      5 replies →

[flagged]

  • If scrapping content is legal, model distillation should be legal too.

    • > If scrapping content is legal, model distillation should be legal too.

      No, because legality should be determined by what's in the best interests of Athropic and OpenAI's business models.

      Hopefully they're working on RLHF their models to insert clauses making that reality clear into any legislation their models generate or review. That way it's only a matter of time until the confusion is cleared up.

    • I suppose model distillation is technically legal, in terms of copyright, because LLM output is automatically public domain.

      It's only "illegal" from a standpoint of breach of contract given its against the terms of use/service, which is to say its not illegal at all, there's no criminality there.

  • There are so many China born Chinese employees at Anthropic and OpenAI and I think quite a lot of them have already been recruited as spy . So it is almost impossible to keep secrets from Chinese government.

  • At what point though doesnt somebody stand back and say "wow, thats really dumb!" I think its probably more an indication of a dev having too much time on their hands rather than being in a hurry.

  • > steal the models or illegally distill them

    Oh no, they're trying to steal the models that were trained on stolen data? That's horrible, I feel so bad for Anthropic.

The more I learn about Anthropic the more they disgust me. Finger crossed for all the companies from their “ban list”