Comment by mrtesthah

4 days ago

Primarily, I want to run agents with a flag like --dangerously-bypass-approvals-and-sandbox to benefit from newer models' improvements in long-horizon tasks. For that to work, I need solid guarantees that an agent won't rm -rf / my entire system, while it still gets full-stack access for installing and configuring databases, server daemons, etc. Ultimately that means working within some sort of container and/or VM.

Facilitating long-running tasks makes multiple parallel agents much more compelling.