Comment by lemming

1 month ago

Is anyone exploring the (imo more practically useful today) space of using agents to put together better changes vs "more commits"?

Yes, I am, although not really in public yet. I use the pi harness, which is really easy to extend. I’m basically driving a deterministic state machine for each code ticket. It starts by refining a short ticket into a full problem description by interviewing me one question at a time, then converts that into a detailed plan with individual steps.

Then it implements each step one by one using TDD, and each bit gets reviewed by an agent in a fresh context. First the tests are written and reviewed to ensure they completely cover the initial problem, and any problems are addressed. That goes round a loop until the review agent is happy, then it moves to implementation. Same thing there: the implementation is written, it loops until the tests pass, then it’s reviewed and fixed until the reviewer is happy. Each subtask gets its own commit.

When all the tasks are done, there’s an overall review that I look at. If everyone is happy, the commits get squashed and we move to manual testing. The agent comes up with a full list of manual tests to cover the change, sets up the test scenarios, and tells me where to debug in the code while working through each test case, so I understand what’s been implemented.

So this is semi-automated: I’m heavily involved at the initial refine stage, then I check the plan. The various implementation and review loops are mostly hands-off, then I check the final review and do the manual testing.
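To make the shape of that concrete, here is a minimal sketch of such a driver in Python. It is not the actual setup and does not use the pi harness API: `Agent`, `Ticket`, `review_loop` and `run_ticket` are all hypothetical names, agents are modeled as plain functions from prompt to reply, and the human-in-the-loop stages (the refine interview, plan check, manual testing) and the "loop until the tests pass" step are reduced to log entries or left out. The point is only that the control flow is ordinary code and the agents are called from inside it.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical agent interface: takes a prompt, returns a text reply.
# A real harness would start a fresh agent context on each call.
Agent = Callable[[str], str]


@dataclass
class Ticket:
    summary: str
    steps: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


def review_loop(work: Agent, review: Agent, task: str, max_rounds: int = 5) -> int:
    """Alternate work and review until the reviewer replies "OK".

    Returns the number of rounds used; raises if the reviewer is never happy,
    so the driver stops instead of looping forever.
    """
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        result = work(f"{task}\nReviewer feedback: {feedback}")
        feedback = review(result)  # the reviewer only sees the result
        if feedback == "OK":
            return round_no
    raise RuntimeError(f"reviewer not satisfied after {max_rounds} rounds: {task}")


def run_ticket(ticket: Ticket, planner: Agent, coder: Agent, reviewer: Agent) -> Ticket:
    """Deterministic driver: plan, then tests and implementation per step."""
    # Assumption: the planner returns one step per non-empty line.
    ticket.steps = [s.strip() for s in planner(ticket.summary).splitlines() if s.strip()]
    for step in ticket.steps:
        review_loop(coder, reviewer, f"write tests for: {step}")
        review_loop(coder, reviewer, f"implement: {step}")
        ticket.log.append(f"commit: {step}")  # one commit per subtask
    # Later stages are only recorded here; in practice they need a human.
    ticket.log += ["overall review", "squash", "manual testing"]
    return ticket
```

With stub functions standing in for the agents, `run_ticket(Ticket("fix bug"), planner, coder, reviewer)` walks every step through both loops and records one commit per step. Capping the rounds in `review_loop` is a design choice of this sketch, so a stuck review surfaces as an error that asks for input rather than burning tokens.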

This is definitely much slower than something like Gas Town, but all the components are individually simple, the driver is a deterministic program, not an agent, and I end up carefully reviewing everything. The final code quality is very good. I generally have 2-4 changes like this ongoing at any one time in tmux sessions, and I just switch between them. At some point I might make a single dashboard summarizing where each one is up to and whether it needs my input, but right now I like the semi-manual process.
