Comment by jackfranklyn
6 days ago
The proxy pattern here is clever - essentially treating the LLM context window as an untrusted execution environment and doing credential injection at a layer it can't touch.
One thing I've noticed building with Claude Code is that it's pretty aggressive about reading .env files and config when it has access. The proxy approach sidesteps that entirely since there's nothing sensitive to find in the first place.
Wonder if the Anthropic team has considered building something like this into the sandbox itself - a secrets store that the model can "use" but never "read".
> a secrets store that the model can "use" but never "read".
How would that work? If the AI can use it, it can read it. E.g:
You'd have to be very specific about how the secret can be used in order for the AI to not be able to figure out what it is. You could provide a http proxy in the sandbox that injects a HTTP header to include the secret, when the secret is for accessing a website for example, and tell the AI to use that proxy. But you'd also have to scope down which URLs the proxy can access with that secret otherwise it could just visit a page like this to read back the headers that were sent:
https://www.whatismybrowser.com/detect/what-http-headers-is-...
Basically, for every "use" of a secret, you'd have to write a dedicated application which performs that task in a secure manner. It's not just the case of adding a special secret store.
Sounds like an attacker could hack Anthropic and get access to a bunch of companies via the credentials Claude Code ingested?
That's how they did "build an AI app" back when the claude.ai coding tool was javascript running in a web worker on the client machine.
It could even hash individual keys and scan context locally before sending to check if it accidentally contains them.
[dead]