Comment by embedding-shape

3 days ago

Who has time for that? This is how I run codex: `codex --sandbox danger-full-access --dangerously-bypass-approvals-and-sandbox --search exec "$PROMPT"`, having to approve each change would effectively destroy the entire point of using an agent, at least for me.

Edit: obviously inside something so it doesn't have access to the rest of my system, but enough access to be useful.

6 comments

embedding-shape

quotemstr 3 days ago

I wouldn't even think of letting an agent work in that made. Even the best of them produce garbage code unless I keep them on a tight leash. And no, not a skill issue.

What I don't have time to do is debug obvious slop.

kees99 3 days ago
I ended up running codex with all the "danger" flags, but in a throw-away VM with copy-on-write access to code folders.
Built-in approval thing sounds like a good idea, but in practice it's unusable. Typical session for me was like:
About to run "sed -n '1,100p' example.cpp", approve? About to run "sed -n '100,200p' example.cpp", approve? About to run "sed -n '200,300p' example.cpp", approve?
Could very well be a skill issue, but that was mighty annoying, and with no obvious fix (options "don't ask again for ...." were not helping).
- quotemstr 2 days ago
  
  One decent approach (which Codex implements, and some others) is to run these commands in a real-only sandbox without approval and let the model ask your approval when it wants to run outside the sandbox. An even better approach is just doing abstract interpretation over shell command proposals.
  You want something like codex -a read-only -s on-failure (from memory: look up the exact flags)
embedding-shape 3 days ago

I keep it on a tight leash too, not sure how that's related. What gets edited on disk is very different from what gets committed.

well_ackshually 3 days ago

>Who has time for that?

People that don't put out slop, mostly.

embedding-shape 3 days ago

That's another thing entirely, I still review and manually decide the exact design and architecture of the code, with more care now than before. Doesn't mean I want the UI of the agent to need manual approval of each small change it does.