Comment by avree
4 days ago
Every time I've seen people use Git worktrees with agents, it's incredibly wasteful. What is the use case for running parallel isolated agents? Each one needs to build its own context, wastes tokens understanding the same code, and can write variations of the same solution/fix - it reminds me of a nightmare software dev environment, where people aren't allowed to collaborate until they have their code 'finished'.
The use case is making them work on distinct tasks in parallel, just like an organisation's developers each (traditionally) have their own laptop with its own isolated environment. So I can say to agent 1 “clean up the unit tests in the payments module” and to agent 2 “implement a simple client for Mailchimp so that we can migrate off Sendgrid”, and the two can work independently.
Note that I don’t work like this personally — I quickly get overwhelmed by the volume of context switching — but I can absolutely see the appeal; particularly for smaller shops.
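For concreteness, a minimal sketch of what that setup could look like, assuming you run it from inside a git checkout; the branch names, paths, and the run_agent helper are invented for illustration, not any particular tool's API:

```python
# One git worktree per agent, so each task gets its own isolated checkout
# while sharing a single .git object store.
import subprocess

TASKS = {
    "agent-1-payments-tests": "Clean up the unit tests in the payments module",
    "agent-2-mailchimp-client": "Implement a simple Mailchimp client so we can migrate off Sendgrid",
}

def run_agent(worktree_path: str, prompt: str) -> None:
    """Placeholder: launch whatever coding agent you use inside the worktree."""
    print(f"[{worktree_path}] would run an agent with prompt: {prompt!r}")

for branch, prompt in TASKS.items():
    path = f"../{branch}"
    # git worktree add -b <branch> <path>: a separate working directory on its own branch
    subprocess.run(["git", "worktree", "add", "-b", branch, path], check=True)
    run_agent(path, prompt)
```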
AI makes coding plans that often come up with phases. It can be interesting to ask it to skip a phase, and do the next one. You can get interesting data about prospective other futures.
IMO using subagents to generate good context is a huge win. That doesn't really require a worktree. But once you have good starting places, good contexts you can feed into an LLM, there's IMO not much concern about "building its own context" (it's already provided right there) nor "wasting tokens" (since that's goodput, good tokens).
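A rough sketch of that pattern, assuming a hypothetical run_agent stand-in for whatever agent CLI or API you actually call (the prompts and tasks are made up):

```python
def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for your agent CLI/API; returns a dummy string here."""
    return f"<agent output for: {prompt[:60]}...>"

# One cheap subagent pass builds a reusable briefing of the relevant code.
briefing = run_agent(
    "Summarize the payments module: key files, public interfaces, "
    "test layout, and known gotchas. Be concise and concrete."
)

# Each task agent starts from that briefing instead of re-exploring the repo.
tasks = [
    "Clean up the unit tests in the payments module",
    "Add retry handling to the payment webhook processor",
]
results = [run_agent(f"{briefing}\n\nTask: {task}") for task in tasks]
```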
the workflow of how we feed and build contexts is the art of all this right now. this project is on point. it's going to be a boom year for Terminal Multiplexing.
few reasons:
* Agents take a while to get the job done, so you give a prompt and then have to wait anywhere between 10 minutes and 2 hours, or longer, for it to finish. So it makes sense to have parallel agents working on different features of the codebase. For example, Boris, creator of Claude Code, recently posted about his setup where he runs 4-5 parallel agents in different tabs. https://x.com/bcherny/status/2007179833990885678 I personally have 30-40 agents running in parallel locally, and the bottleneck is the memory on my local machine.
* Even for the same prompt, the way I use agents is to run multiple agents with the same prompt, then review and pick the best output (essentially pass@k for coding agents). This is especially useful for harder tasks, where I give the same prompt to CC, Codex, Droid, and my own coding agent. Each model/scaffold has its own distribution, and they work better when they are in distribution, so by sampling more we increase the chance of success. (I know this is wasteful, but we currently live in a world of abundant cheap tokens; put those $200 subs to good use.)
So if we push this to the limit, I think we can improve the generation problem by shifting the complexity toward verification: if you have 4 candidate solutions to your problem that all pass the tests, how do you review and pick the best? This is where code review comes in.
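A sketch of that pass@k-plus-verification loop, with agent commands, repo path, and CLI shape all placeholders rather than real invocations:

```python
# Same prompt to several agents, keep the candidates that pass the tests,
# then review the survivors by hand.
import subprocess, tempfile, shutil

AGENTS = ["claude", "codex", "droid", "my-agent"]   # placeholder agent commands
PROMPT = "Fix the flaky retry logic in the payments worker"
REPO = "path/to/repo"                               # assumed local checkout

def run_candidate(agent_cmd: str, prompt: str) -> str:
    """Give one agent its own scratch copy of the repo; return that copy's path."""
    scratch = tempfile.mkdtemp(prefix=f"{agent_cmd}-")
    repo_copy = f"{scratch}/repo"
    shutil.copytree(REPO, repo_copy)
    subprocess.run([agent_cmd, prompt], cwd=repo_copy)   # hypothetical CLI shape
    return repo_copy

def passes_tests(repo_path: str) -> bool:
    return subprocess.run(["pytest", "-q"], cwd=repo_path).returncode == 0

candidates = [run_candidate(agent, PROMPT) for agent in AGENTS]
survivors = [c for c in candidates if passes_tests(c)]
print(f"{len(survivors)}/{len(candidates)} candidates pass the tests; review these by hand:")
for path in survivors:
    print(" ", path)
```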
Context building isn't the bottleneck -- developer capacity is. Tokens are cheap and getting cheaper, but dev cognitive bandwidth is fixed and expensive.
You don't want to use the same context for multiple tasks because performance suffers with context size.
The codebase can be added to the context once and the computation shared between parallel agents.
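A small sketch of that idea, assuming a hypothetical complete() call and an assumed briefing file; the point is that an identical prefix plus a short task-specific suffix lets providers that cache prompt prefixes reuse the shared computation:

```python
def complete(prompt: str) -> str:
    """Hypothetical stand-in for your model/agent call."""
    return f"<completion for a {len(prompt)}-char prompt>"

# Built once (e.g. by a subagent or a script) and reused verbatim by every agent.
shared_prefix = open("docs/codebase-briefing.md").read()

tasks = [
    "Clean up the unit tests in the payments module",
    "Implement a simple Mailchimp client to replace Sendgrid",
]

# Identical prefix + small task-specific suffix: cache-friendly, and each
# agent's context stays focused on its own task.
outputs = [complete(f"{shared_prefix}\n\nTask: {task}") for task in tasks]
```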
Monorepos, or rather large enough codebases, also mean the work tends not to overlap.