Comment by jackfranklyn

1 month ago

The proxy pattern here is clever - essentially treating the LLM context window as an untrusted execution environment and doing credential injection at a layer it can't touch.

One thing I've noticed building with Claude Code is that it's pretty aggressive about reading .env files and config when it has access. The proxy approach sidesteps that entirely since there's nothing sensitive to find in the first place.

Wonder if the Anthropic team has considered building something like this into the sandbox itself - a secrets store that the model can "use" but never "read".

8 comments

jackfranklyn

mike-cardwell 20 days ago

> a secrets store that the model can "use" but never "read".

How would that work? If the AI can use it, it can read it. E.g:

    secret-store "foo" > file
    cat file

You'd have to be very specific about how the secret can be used in order for the AI to not be able to figure out what it is. You could provide a http proxy in the sandbox that injects a HTTP header to include the secret, when the secret is for accessing a website for example, and tell the AI to use that proxy. But you'd also have to scope down which URLs the proxy can access with that secret otherwise it could just visit a page like this to read back the headers that were sent:

https://www.whatismybrowser.com/detect/what-http-headers-is-...

Basically, for every "use" of a secret, you'd have to write a dedicated application which performs that task in a secure manner. It's not just the case of adding a special secret store.

ashwinr2002 18 days ago

This seems like an under-rated comment. You are right, this is a vulnerability and the blog doesn't talk about this.

ipython 20 days ago

I guess I don't understand why anyone thinks giving an LLM access to credentials is a good idea in the first place? It's been demonstrated best practice to separate authentication/authorization from the LLM's context window/ability to influence for several years now.

We spent the last 50 years of computer security getting to a point where we keep sensitive credentials out of the hands of humans. I guess now we have to take the next 50 years to learn the lesson that we should keep those same credentials out of the hands of LLMs as well?

I'll be sitting on the sideline eating popcorn in that case.

JoshuaDavid 20 days ago

That's how they did "build an AI app" back when the claude.ai coding tool was javascript running in a web worker on the client machine.

ironbound 20 days ago

Sounds like an attacker could hack Anthropic and get access to a bunch of companies via the credentials Claude Code ingested?

iterateoften 20 days ago

It could even hash individual keys and scan context locally before sending to check if it accidentally contains them.

edstarch 20 days ago

While sandboxing is definitely more secure... Why not put a global deny on .env-like filename patterns as a first measure?

mockbuild 20 days ago

[dead]