Comment by arjie
10 hours ago
I have never found any utility in that. After all, you can still just review the diffs and ask it to explain specific sections instead.
> After all, you can still just review the diffs
anonu has explicitly said that they've wiped a database twice as a result of agents doing stuff. What sort of diff would help against an agent running commands without your approval?
The agent does not have to run in your user context. That's an easy mistake to make in yolo mode, but it's easy to fix afterwards. E.g. this is what I use now so I can release the agent from my machine and also constrain its access:
The agent is fully capable of making PRs etc. if you provide appropriate tooling. It wipes the DB, but the DB is just a separate ephemeral pod. One day perhaps it will find a 0-day and break out, but so far it has not.
Hah, I run my agent inside a Docker container with just the code. Anything clever it tries to do just goes nowhere.
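For anyone wondering what "a container with just the code" might look like, here is a minimal sketch, not the commenter's actual setup. The image name `agent-sandbox:latest`, the `my-agent` command, and the helper `sandboxed_agent_command` are all hypothetical. It only builds a `docker run` command line that mounts the code directory and nothing else, drops Linux capabilities, and by default cuts off networking. An agent that talks to a hosted model would need `allow_network=True`, or a proxy, which this sketch does not cover.

```python
import shlex
from pathlib import Path


def sandboxed_agent_command(
    code_dir: str,
    image: str,
    agent_cmd: list[str],
    allow_network: bool = False,
) -> list[str]:
    """Build a `docker run` argv that exposes only the code directory.

    Assumption: `image` already contains the agent tooling.
    """
    workdir = Path(code_dir).resolve()
    argv = ["docker", "run", "--rm", "--cap-drop", "ALL"]
    if not allow_network:
        # No network at all: nothing outside the container is reachable.
        argv += ["--network", "none"]
    argv += [
        "-v", f"{workdir}:/workspace",  # the only host path the agent can see
        "-w", "/workspace",
        image,
        *agent_cmd,
    ]
    return argv


if __name__ == "__main__":
    # Hypothetical image and agent command, for illustration only.
    argv = sandboxed_agent_command(
        ".", "agent-sandbox:latest", ["my-agent", "--task", "fix tests"]
    )
    print(shlex.join(argv))
```

The point of building the argv explicitly is that the agent's blast radius is visible in one place: one mount, no host credentials, no database socket.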
> After all, you can still just review the diffs
The diff: +8000 -4000
You can ask it to split the changes into appropriately scoped PRs. A SOTA model + harness can do it. I find it useful to separate refactors from implementations, just like with humans, but I admittedly rely heavily on multi-provider review.
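One way to enforce that split mechanically is a size gate on the diff before review. The sketch below is an assumption-laden illustration, not anyone's actual tooling: it parses the tab-separated output of `git diff --numstat` (added, deleted, path; binary files show `-` for both counts), and the helper names and the 800-line threshold are hypothetical.

```python
def diff_size(numstat: str) -> tuple[int, int]:
    """Sum added and deleted lines from `git diff --numstat` output."""
    added = deleted = 0
    for line in numstat.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        a, d, _path = parts
        if a == "-" or d == "-":
            continue  # binary file: numstat reports no line counts
        added += int(a)
        deleted += int(d)
    return added, deleted


def too_big_to_review(numstat: str, max_changed: int = 800) -> bool:
    """True if the change should be sent back to be split into smaller PRs.

    The default threshold is an arbitrary illustrative value.
    """
    added, deleted = diff_size(numstat)
    return added + deleted > max_changed


if __name__ == "__main__":
    sample = "8000\t4000\tsrc/app.py\n-\t-\tassets/logo.png\n"
    print(diff_size(sample), too_big_to_review(sample))
```

A check like this could run in CI or in the agent harness, so an oversized change gets bounced with a request to split it instead of landing on a human reviewer.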