Comment by daxfohl

1 month ago

I agree this is what the article says, but it's a pretty bad premise. That would only be the case if the primary user interaction with coding agents was "feed in requirements, get a finished product". But we all know it's a more iterative process than that.

5 comments

jbmilgrom 1 month ago

Author here

We are building this at docflowlabs, i.e., a self-healing system that can respond to customer feedback automatically. And you're right that not all customers know what they want, or even how to express it when they do, which is why the agent loop we have facing them is way more discovery-focused than the internal one.

And we still have humans in the loop for everything (for now!), e.g., the agent does not move on to implementation until the root cause has been approved.

daxfohl 1 month ago

Cool, I tried something similar over a couple of weeks, but the problem I ran into was that beyond a fairly low level of complexity, the English spec became more confusing than the code itself. Even for a simple multi-step KYC workflow, it got very convoluted and hard to make precise, whereas in code it's a couple of loops and if/else blocks with no possibility of misinterpretation. Have you encountered that at all, or found any techniques useful in these situations?

That's why I feel like iterative workflows have won out so far. Each step gets you x% closer, so you close in on your goal exponentially, whereas the one-shot approach closes in much more slowly, and each iteration starts from scratch. The advantage of one-shot is that you then have a spec for the whole system, though you can also just generate that from the code if you write the code first.
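
To make the "x% closer per step" intuition above concrete, here is a small Python sketch. It assumes an idealized model in which every iteration closes a fixed fraction of the remaining gap, which is a simplification and not a measured property of any agent workflow. The function names are made up for illustration.

```python
def remaining_gap(fraction_closed_per_step: float, steps: int) -> float:
    """Fraction of the original gap left after `steps` iterations.

    Assumes (hypothetically) that each iteration closes the same
    fraction of whatever gap is still open, so the gap shrinks
    geometrically: (1 - x) ** n.
    """
    if not 0.0 < fraction_closed_per_step < 1.0:
        raise ValueError("fraction_closed_per_step must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return (1.0 - fraction_closed_per_step) ** steps


def steps_to_reach(fraction_closed_per_step: float, target_gap: float) -> int:
    """Smallest number of iterations that brings the gap to target_gap or less."""
    if not 0.0 < fraction_closed_per_step < 1.0:
        raise ValueError("fraction_closed_per_step must be between 0 and 1")
    if not 0.0 < target_gap <= 1.0:
        raise ValueError("target_gap must be in (0, 1]")
    gap = 1.0
    steps = 0
    while gap > target_gap:
        gap *= 1.0 - fraction_closed_per_step  # close a fixed share of what is left
        steps += 1
    return steps
```

Under that assumption, closing half the gap on each step leaves about 3% of it after five iterations, and it takes seven iterations to get under 1%.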
- jbmilgrom 1 month ago
  
  That's right, and agents turning specs into software can go in all sorts of directions, especially when we don't control the input.
  What we've done to mitigate that is essentially to back every entrypoint (customer comment, internal ticket, etc.) with a remote Claude Code session with persistent memory. That session essentially becomes the expert in the case. And we've developed checkpoints that work from experience (e.g. the root cause one) where a human has the opportunity to take over the wheel, so to speak, and drive in a different direction with all the context/history up to that point.
  Basically, we are creating an assembly line where agents do most of the work and humans do less and less as we continue to optimize its different parts.
  As far as techniques, it's all boring engineering:
  * Temporal workflow for managing the lifecycle of a session
  * Complete ownership of the data model e2e. We don't use Linear, for example; we built our own ticketing system so we could represent Temporal signals, GitHub webhooks and events from the remote Claude sessions exactly how we wanted
  * Incremental automation gains over and over again. We do a lot of the work manually first (like old-fashioned hand coding lol) before trying to automate, so we become experts in that piece of the assembly line and it becomes obvious how to incrementally automate... rinse and repeat
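
The checkpoint idea described above (the agent does not start implementation until a human approves the root cause, and a human can redirect with the full history available) might be sketched roughly like this in plain Python. This is not docflowlabs' code, and it deliberately uses no Temporal or Claude Code APIs. `Session`, `Stage`, and every method name are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    DISCOVERY = "discovery"
    ROOT_CAUSE_REVIEW = "root_cause_review"
    IMPLEMENTATION = "implementation"


@dataclass
class Session:
    """Hypothetical per-entrypoint session that keeps its whole history."""
    entrypoint: str
    stage: Stage = Stage.DISCOVERY
    history: list = field(default_factory=list)

    def propose_root_cause(self, root_cause: str) -> None:
        # The agent finishes discovery and stops at the checkpoint.
        if self.stage is not Stage.DISCOVERY:
            raise RuntimeError("root cause can only be proposed during discovery")
        self.history.append(f"agent proposed root cause: {root_cause}")
        self.stage = Stage.ROOT_CAUSE_REVIEW

    def review(self, approved: bool, note: str = "") -> None:
        # The human either approves or steers the session back to discovery.
        if self.stage is not Stage.ROOT_CAUSE_REVIEW:
            raise RuntimeError("nothing is waiting for review")
        if approved:
            self.history.append("human approved root cause")
            self.stage = Stage.IMPLEMENTATION
        else:
            self.history.append(f"human redirected: {note}")
            self.stage = Stage.DISCOVERY

    def implement(self) -> None:
        # Implementation is blocked until the checkpoint has been passed.
        if self.stage is not Stage.IMPLEMENTATION:
            raise PermissionError("root cause has not been approved yet")
        self.history.append("agent started implementation")
```

In a real system the stage transitions would presumably be driven by workflow signals and not direct method calls, but the gate itself is the same: `implement` refuses to run until `review(approved=True)` has been recorded, and a rejection keeps the accumulated history intact.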
  