← Back to context

Comment by kees99

1 day ago

I ended up running codex with all the "danger" flags, but in a throw-away VM with copy-on-write access to code folders.

Built-in approval thing sounds like a good idea, but in practice it's unusable. Typical session for me was like:

  About to run "sed -n '1,100p' example.cpp", approve?
  About to run "sed -n '100,200p' example.cpp", approve?
  About to run "sed -n '200,300p' example.cpp", approve?

Could very well be a skill issue, but that was mighty annoying, and with no obvious fix (options "don't ask again for ...." were not helping).

One decent approach (which Codex implements, and some others) is to run these commands in a real-only sandbox without approval and let the model ask your approval when it wants to run outside the sandbox. An even better approach is just doing abstract interpretation over shell command proposals.

You want something like codex -a read-only -s on-failure (from memory: look up the exact flags)