Comment by harikb

10 hours ago

On the credentials point, here is what I find.

Day 1: Carefully handles the creds, gives me a lecture (without asking) about why .env should be in .gitignore and why I should edit .env and not hand over the creds to it.

Day 2: I ask for a repeat. It has lost track of that skill or setting, frantically searches my entire disk, reads .env along with many other files, realizes it is holding a token, hand-writes curl commands to test the token, and then comes back with some result.

It is like a security expert on Day 1 and an absolutely mediocre intern on Day 2.

2 comments

eterm 9 hours ago

I found the same. It was super careful handling the environment variable until it hit an API error; then I caught in its thinking "Let me check the token is actually set correctly" and it just echoed the token out.

(These were low-stakes test creds I was testing with, thankfully.)
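For what it's worth, the "is the token actually set" check does not need the value itself. A minimal sketch, assuming the token lives in an environment variable; `describe_secret` and the variable name `API_TOKEN` are made up for illustration:

```python
import hashlib
import os


def describe_secret(name: str) -> str:
    """Report whether an environment variable is set without revealing its value."""
    value = os.environ.get(name)
    if not value:
        return f"{name}: not set"
    # A short SHA-256 prefix can be compared against a known-good fingerprint
    # without the secret itself reaching the terminal or a transcript.
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]
    return f"{name}: set, length={len(value)}, sha256[:8]={digest}"


if __name__ == "__main__":
    # "API_TOKEN" is a hypothetical variable name.
    print(describe_secret("API_TOKEN"))
```

This only helps if the agent actually uses it instead of `echo`, and a fingerprint of a short or guessable secret can still be brute-forced, so treat it as a habit rather than a control.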

I never pass creds via env or anything else it can access now.

My approach now is to get it to write me LINQPad scripts. LINQPad has a utility function that gets creds out of a user-encrypted share, or prompts if they're not in the store.

This works well, but requires me to run the scripts and guide it.
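The same "look it up, else prompt the human" pattern can be sketched outside LINQPad. This is a rough Python analogue, not LINQPad's API: `get_credential` is a hypothetical helper, and the plain dict stands in for whatever encrypted store you actually use.

```python
import getpass
from typing import Callable, MutableMapping


def get_credential(
    name: str,
    store: MutableMapping[str, str],
    prompt: Callable[[str], str] = getpass.getpass,
) -> str:
    """Return the secret for `name`, prompting the human once if it is missing."""
    if name not in store:
        # getpass reads from the terminal without echoing, so the value
        # never appears on screen or in the script source.
        store[name] = prompt(f"Enter value for {name}: ")
    return store[name]


if __name__ == "__main__":
    # Assumption: an in-memory dict as the store; a real one would be encrypted.
    secrets: MutableMapping[str, str] = {}
    token = get_credential("example-api", secrets)
    print(f"got a credential of length {len(token)}")
```

The point is that the generated script only ever contains the name of the credential, so the model can write and revise it without seeing the value.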

Ultimately, fully autonomous isn't compatible with secrets: if it really wanted to inspect one, it could just redirect the request to an echo service.

The only real way is to deal with it the same way we deal with insider threat.

A proxy layer / secondary auth, which injects the real credentials. Then give Claude its own user within that auth system, so it owns those creds. Now responsibility can be delegated to it without exposing the original credentials.
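The core of that proxy layer is just a header swap. A minimal sketch, assuming bearer tokens over HTTP; `inject_credentials`, the token values, and the `X-Acting-Agent` audit header are all hypothetical:

```python
import hmac
from typing import Dict, Mapping


def inject_credentials(
    headers: Mapping[str, str],
    agent_tokens: Mapping[str, str],
    upstream_secret: str,
) -> Dict[str, str]:
    """Swap an agent-scoped bearer token for the real upstream credential."""
    # Header names are case-insensitive, so look the header up that way.
    auth = next((v for k, v in headers.items() if k.lower() == "authorization"), "")
    scheme, _, presented = auth.partition(" ")

    agent = None
    for token, owner in agent_tokens.items():
        # compare_digest avoids leaking token contents through timing.
        if scheme == "Bearer" and hmac.compare_digest(presented.encode(), token.encode()):
            agent = owner
    if agent is None:
        raise PermissionError("unknown agent token")

    # Drop the agent's own Authorization header and attach the real one.
    outbound = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    outbound["Authorization"] = f"Bearer {upstream_secret}"
    outbound["X-Acting-Agent"] = agent  # hypothetical header for audit logs
    return outbound


if __name__ == "__main__":
    issued = {"agent-token-123": "claude"}  # tokens the proxy handed to the agent
    out = inject_credentials(
        {"Authorization": "Bearer agent-token-123", "Accept": "application/json"},
        issued,
        "real-secret",  # held only by the proxy process
    )
    print(sorted(out))
```

The proxy would also need to pin the upstream host, otherwise the echo-service trick still works against it. The agent token can be revoked or scoped on its own, which is what makes the delegation meaningful.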

That's a lot of work when you're just exploring an API or DB or similar.

jbreckmckye 6 hours ago

I think it is just because they are having to load shed! Some days you may be getting much less compute. The main way "thinking" operates is to just iterate on the result a few more times.