Comment by chr15m

22 days ago

Approval should be mandatory for any non-read tool call. You should read everything your LLM intends to do, and approve it manually.

"But that is annoying and will slow me down!" Yes, and so will recovering from disastrous tool calls.

You’ll just end up approving things blindly, because 95% of what you read will seem obviously right and only 5% will look wrong. I would prefer to let the agent do whatever it wants for 15 minutes and then look at the result, rather than having to approve every single command it runs.

  • Works until it has access to write to external systems and your agent is slopping up Linear or GitHub without you knowing, identified as you.

    • Sure; I mean this is what I _would like_; I’m not saying this would work 100% of the time.

That kind of blanket demand doesn't persuade anyone and doesn't solve any problem.

Even if you get people to sit and press a button every time the agent wants to do anything, you're not getting the actual alertness and rigor that would prevent disasters. You're getting a bored, inattentive person who could be doing something more valuable than micromanaging Claude.

Managing capabilities for agents is an interesting problem. Working on that seems more fun and valuable than sitting around pressing "OK" whenever the clanker wants to take actions that are harmless in the vast majority of cases.

  • I don't mean to sound like I'm demanding this. I'm saying you will get better outcomes if you choose to do this as a developer.

    You're right it's an interesting problem that seems fun to work on. Hopefully we'll get better harnesses. For now I'm checking everything.

It’s not just annoying; at scale it makes using the agent CLIs impossible. You can tell someone spends a lot of time in Claude Code: they can type --dangerously-skip-permissions with their eyes closed.

  • Yep. The agent CLIs have the wrong level of abstraction. Needs more human in the loop.

This is like having a firewall on your desktop where you manually approve each and every connection.

Secure, yes? Annoying, also yes. Very error-prone too.

It's not reliable. The AI can just not prompt you for approval, or hide things, etc. AI models are crafty little fuckers and they like to lie to you and find secret ways to do things with ulterior motives. This isn't even a prompt injection thing; it's an emergent property of the model. So you must use an environment where everything can blow up and it's fine.

  • The harness runs the tool call for the LLM. It is trivial to not run the tool call without approval, and many existing tools do this.
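That point can be sketched in code. Below is a minimal, hypothetical approval gate (the tool names, `ask_user` callback, and tool-registry shape are all invented for illustration; real harnesses differ). The key property: the model only *proposes* tool calls; the check lives in harness code the model cannot reach or skip.

```python
# A minimal sketch of an approval-gated tool harness (hypothetical API).
# The LLM proposes a tool call; the harness executes it only if it is
# read-only or the user explicitly approves.

READ_ONLY_TOOLS = {"read_file", "list_dir", "grep"}  # assumed safe set

def run_tool_call(name, args, tools, ask_user):
    """Execute a proposed tool call, gating non-read tools on approval.

    tools    -- dict mapping tool name to a callable
    ask_user -- callback returning True only on explicit approval;
                it runs in the harness, outside the model's control
    """
    if name not in READ_ONLY_TOOLS:
        if not ask_user(f"Allow {name}({args})?"):
            return {"error": "denied by user"}
    return tools[name](**args)

# Toy tools standing in for real file access.
tools = {
    "read_file": lambda path: f"contents of {path}",
    "write_file": lambda path, text: f"wrote {path}",
}

deny = lambda question: False  # a user who approves nothing

print(run_tool_call("read_file", {"path": "a.txt"}, tools, deny))
print(run_tool_call("write_file", {"path": "a.txt", "text": "x"}, tools, deny))
```

Even with the user denying everything, the read still goes through while the write is refused; the model never gets a chance to "just not prompt you", because prompting isn't its job.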