Comment by manishsharan
4 days ago
Please don't take offense to this very dumb question:
Why can't you do the planning? Figure out what needs to be done, break it down into small tasks and then ask the agent to execute those small tasks?
When we executed projects in the past, this is what I would do as a lead: figure out the overall software architecture and delegate the tasks to developers.
This way I always knew how the system worked and could extend it as needed. I am not in a development role anymore but I am trying to understand why we are delegating planning and software architecture to coding agents?
The kinds of detailed (and excellent) plans Opus or Fable can generate on our large code base would take me maybe 1-2 days to work through and they do it in 10-20 minutes.
Maybe I spend 2-4 hours reviewing it, checking things with colleagues etc.
Then I press "go" and maybe an hour later I have a tested system ready for manual review.
Their plans are at least as good as any I've seen. Their weakness is if there are unstated assumptions I have about how things need to be done, so most of my time is now getting those assumptions stated properly and then reviewing.
Why wouldn't I use this? It's the best tool I've used in my 30 years of professional programming.
Did you manage to set up a discussion with the agent to reveal such assumptions? Sometimes they drop wrong unstated assumptions when contradicted by evidence, but if we’re talking about a plan for the future the evidence is thin.
> Did you manage to set up a discussion with the agent to reveal such assumptions?
This is what the plan review is for.
Usually it will have something like "modify abc.ts to update the widget number in the wnx collection" and I'll say "hang on, why does that need to be updated when XYZ?" and that subsequent discussion will reveal assumptions that are not shared.
Cognitive debt
Non sequitur
I think we'd be talking past each other in terms of what "planning" means, but I wrote this anyways:
You're wrong about what I mean by delegating architecture to a coding agent.
I'll let the coding agent take the first shot at it, already having in my mind a decent idea about how I'd do that. Worst case it's wrong and I need to correct it; more than half the time it comes up with the same sort of design, and sometimes it comes up with a better alternative.
Additionally, the same pattern of "sometimes wrong, mostly good, sometimes better" also plays out wrt naming things. I thought I was decent at naming things, but an LLM is literally built on turning 'concepts' in a vector space into words.
And in a very real way the names it's choosing will 'compress' the ideas so that the next time an LLM reads it, it is more likely to understand.
For this to work though you need your complete system accessible and well structured.
You say "I always knew how the system worked and could extend it as needed". If an AI can't learn how your system works then that's a problem with the system setup, not the AI. An AI can find its way in the Linux kernel or Chromium source code just fine.
If you're in a role where you only spend time on planning / architecture, then I assume things are pretty gnarly to begin with. The thing I can only guess at - and which is on a spectrum - is how much of our role exists to support the weight of accidental vs essential complexity.
i.e. can the engineers not do the planning because: they're not that good, or it's very broad things that need to expertly interplay with each other, or because the org has a mountain of buried bodies.
In my experience some of the more fanatic AI people are blind to the mountain of buried bodies covering a lot of essential complexity, but others can be blind to how well AI works when you can just shoot off a prompt to unbury a body and actually reduce the debt.
But in one sentence:
> Why can't you do the planning?
This way lets me do more planning - planning is basically all I do now.
I could do the planning but I don't, for the same reason that I could write the source code but I don't, for the same reason that I could write the machine code but I don't.
This is more or less what I do. Then again, I work on small parts of the codebase at a time, so maybe the autonomous agent works better when you're doing larger refactors over large codebases.
Even in that situation, I think I would still only feel comfortable approaching the task as I would do it without AI, and using the AI to accelerate the parts that would be time-consuming. E.g. finding where/how feature X is implemented, how it would affect the overall system if I were to change it this way, etc.
Whatever you delegated in the past probably also required planning by the engineer who went down and got it done. Most planning done by agents is at this same level: the agent explores the codebase, understands where to touch, weighs tradeoffs and code-level architecture, and asks the user for more context or balances it with assumptions and other patterns already present in the code.
People get defensive when you ask this, because they think you’re saying they’re being lazy.
…but it’s more than just that (in most cases; I am just lazy sometimes); fundamentally there’s a limit to how much complexity people can comprehend.
We are good at working at high level abstractions, modules with clear APIs that can be strung together into some kind of feature.
You don’t need to look inside the black box of the module if you trust the implementer; I’ve never opened up the internals of a calendar to be like “how does this work?”. I just don’t care. It’s a calendar. I use the API.
I think most people are using these tools in this way; very few people are having an agent write a plan, then a sub agent review it, no human in the loop. That's for prototypes, and for yolo cowboys using open claw and playing with their phones instead of working; we have a few at work, but their PRs are regularly rejected as slop.
…but, realistically; many people aren’t software architects. They may not even know coding patterns, forget architecture patterns.
Having an agent spit out generic software architecture is probably better than what they were producing before.
Writing a module / feature using generic architecture and planning is probably better than random code spaghetti, right?
It’s easy to lament the loss of craft here, but at the end of the day, the models today do an ok job of this. The models of tomorrow will probably be better at it than many people.
Architecture is easy compared to actually implementing things. You just wave your hands from your ivory tower and say “more event sourcing”.
"Having an agent spit out generic software architecture is probably better than what they were producing before."
If they were a poor programmer/architect, I don't think the AI would make the end result any better. It would amplify their lack of skill. Sure, the low-level code might be more airtight and idiomatic, but that's not even where poor skill really manifests itself. It's at the higher level of thinking in terms of the system and understanding the proper context of the business/technology, etc.
This simply isn't true anymore.
High level generic advice from agents is often, in my experience, significantly better, unmodified, than doing nothing.
Obviously it's better to do it properly, but you know… Opus 4.8 is a pretty great model.
You might be surprised at the quality of the planning, architecture and task breakdown that a simple prompt with some context hints can give you.
…at the end of the day, if I’m working with someone and they give me 6/10 plans based on AI instead of stupid/10 plans they dreamed up, or 0/10 plans they didn't even bother (or were in too much of a hurry) to write, I’ll take it.
Tragedy of the commons? /shrug
You gotta be pragmatic. It turns subpar contributors into useful contributors.