Comment by mycall

6 hours ago

I don't know how people steer multiple agent sessions in parallel. The constant cognitive switching is exhausting, and you miss moments unless you are constantly CLI-jumping. It is like being an operator.

It would be exhausting if it were always necessary. If you spend the time up front to properly define goals and requirements in a machine- and LLM-verifiable way, then the agents can work in isolation or with minimal oversight, and present a working result that meets those criteria.
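One concrete way to make goals machine-verifiable is to write the acceptance criteria as executable tests before the agent starts, and let the agent iterate until they pass. A minimal sketch (the `slugify` task and its placeholder body are hypothetical examples, not anything from a real project):

```python
# Hypothetical acceptance criteria for an agent task, written up front.
# The agent's only goal is to make these assertions pass; the body below
# is a placeholder the agent would replace with its own implementation.

def slugify(title: str) -> str:
    # Placeholder implementation standing in for the agent's work.
    return "-".join(title.lower().split())

# Machine-verifiable requirements: no judgment calls left to the agent.
assert slugify("Hello World") == "hello-world"
assert slugify("  Spaces  Everywhere  ") == "spaces-everywhere"
assert slugify("MiXeD Case") == "mixed-case"
print("all acceptance checks pass")
```

The point is not the function itself but that "done" is decided by the checks, not by a vibe review of the diff.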

BUT: How many people do you know who can achieve sufficient clarity up front? It is a skill (or set of skills) that needs developing. It can also mean the difference between spending $20 in tokens and $2,000, and/or throwing away the result and starting from scratch (you don't really want to touch an AI-generated codebase with fundamental design flaws if you value your time and sanity).

In the meantime, deliberate checkpoints for human review are still a good idea.

My theory: behind every "10x AI coder" is a long trail of expensive failures that never saw the light of day, but which they are learning from. The early adopters will therefore have a competitive advantage.

I have been thinking about this, and I think it might work well for established codebases that already exhibit specific patterns used over and over again. E.g. a CRUD app with an OpenAPI spec, a controller layer, a service layer, and a repository layer, or even good old MVC for that matter. With these established patterns it wouldn't be unthinkable to give an orchestrator agent a set of requirements detailed enough for a group of agents to implement, each agent focusing on a different type of task, including a QA agent. I'd imagine you end up with one or more PRs for you to review and squash & merge when you're happy with it.

Edit: What I mean is that these agents would work autonomously until their task is accomplished. They can ask for clarification if they require it.
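The orchestrator idea could be sketched roughly like this. A toy outline only, not a real framework: `call_llm`, the role names, and the example requirement are all hypothetical stand-ins.

```python
# Toy sketch of an orchestrator fanning requirements out to role-specific
# agents for a layered CRUD app. `call_llm` is a hypothetical stand-in for
# whatever model API you use; a real system would need retries, shared
# context, tool access, and a way for agents to ask for clarification.

from dataclasses import dataclass

@dataclass
class TaskResult:
    role: str        # e.g. "controller", "service", "repository", "qa"
    requirement: str
    output: str

def call_llm(role: str, requirement: str) -> str:
    # Placeholder: a real implementation would call a model here,
    # with a role-specific system prompt and the repo as context.
    return f"[{role}] implemented: {requirement}"

def orchestrate(requirements: list[str]) -> list[TaskResult]:
    roles = ["controller", "service", "repository", "qa"]
    results = []
    for req in requirements:
        # Each requirement fans out to one task per layer, plus a QA pass.
        for role in roles:
            results.append(TaskResult(role, req, call_llm(role, req)))
    return results

# In practice the combined output would land as one or more PRs
# for a human to review and squash & merge.
patch_notes = orchestrate(["CRUD endpoints for /users per the OpenAPI spec"])
print(len(patch_notes))
```

The agents run until their task is done, the QA role gates the result, and the human only enters at PR review, which is exactly the checkpoint argued for above.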