Comment by jstanley
11 hours ago
Conversely, I often find coding agents privileging the existing code when they could do a much better job if they changed it to suit the new requirement.
I guess it comes down to how ossified you want your existing code to be.
If it's a big production application that's been running for decades then you probably want the minimum possible change.
If you're just experimenting with stuff and the project didn't exist at all 3 days ago then you want the agent to make it better rather than leave it alone.
Probably they just need to learn to calibrate themselves better to the project context.
The tradeoff is highly contextual; it's not a tradeoff an agent can always make by inspecting the project themselves.
Even within the same project, for a given PR, there are some parts of the codebase I want to modify freely and some that I want fixed to reduce the diff and testing scope.
I try to explain up-front to the agent how aggressively they can modify the existing code and which parts, but I've had mixed success; usually they bias towards a minimal diff even if that means duplication or abusing some abstractions. If anyone has had better success, I'd love to hear your approach.
Just brainstorming, but perhaps a more tangible gradient, with social backpressure? Imagine three identical patch tools: "patch", "submit patch", and "send patch to chief architect and wait". With the "where each can be used" explained or even enforced. Having the contrast of less-aggressive options, might make it easier to encourage more aggressive refactoring elsewhere. Or pushing the impact further up the CoT, "patch'ing X requires an analysis field describing less invasive alternatives and their un/suitability; for Y, just do it, refactor aggressively".
To get the agent to think for itself sometimes it feels like I have to delete a bunch of code and markdown first. Instruction to refactor/reconsider broadly has such mild success, I find.
I'll literally run an agent & tell it to clean up a markdown file that has too much design in it, delete the technical material, and/or delete key implementations/interfaces in the source, then tell a new session to do the work, come up with the design. (Then undelete and reconcile with less naive sessions.)
Path dependence is so strong. Right now I do this flow manually but I would very much like to codify this, make a skill for this pattern that serves so well.
Tell me you’re using Codex without telling me you’re using Codex :)