Comment by ahmadyan

2 months ago

A few reasons:

* Agents take a while to get the job done, so you give a prompt and then wait anywhere from 10 minutes to 2 hours or longer for it to finish. So it makes sense to have parallel agents working on different features of the codebase. For example, Boris, the creator of Claude Code, recently posted about his setup, where he runs 4-5 parallel agents in different tabs: https://x.com/bcherny/status/2007179833990885678 I personally have 30-40 agents running in parallel locally, and the bottleneck is the memory on my local machine.

* Even for the same prompt, the way I use agents is to run multiple agents with the same prompt, then review and pick the best output (essentially pass@k for coding agents). This is especially useful for harder tasks, where I give the same prompt to CC, Codex, Droid, and my own coding agent. Each model/scaffold has its own distribution, and they work better when they are in distribution. So by sampling more, we increase the chance of success. (I know this is wasteful, but we currently live in a world of abundant cheap tokens, so put those $200 subs to good use.)
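To put a rough number on "sampling more increases the chance of success", here is a back-of-the-envelope sketch. It assumes every run succeeds independently with the same probability p, which is a simplification: runs from CC, Codex, Droid, etc. are neither independent nor equally likely to succeed. The function name and the 0.3 success rate are made up for illustration.

```python
def chance_at_least_one_success(p: float, k: int) -> float:
    """Probability that at least one of k runs succeeds.

    Assumption (not from the original comment): runs are independent
    and each succeeds with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    # All k runs fail with probability (1 - p)^k; take the complement.
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Hypothetical per-run success rate of 30% on a hard task.
    for k in (1, 2, 4, 8):
        print(k, round(chance_at_least_one_success(0.3, k), 3))
```

Under these assumptions, going from 1 run to 4 runs takes a 30% task to roughly a 76% chance that at least one run gets it right, which is the intuition behind running the same prompt several times.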

So if we push this to the limit, I think we can make the generation problem easier by shifting the complexity toward verification, i.e. if you have 4 candidate solutions to your problem that all pass the tests, how do you review and pick the best one? This is where code review comes in.
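As a toy illustration of the verification step, here is a minimal sketch that drops candidates failing the tests and orders the rest for human review. Everything in it is hypothetical: the `Candidate` fields, the agent names in the example, and especially the choice of "smallest diff first" as the ordering, which is only a stand-in heuristic and not a substitute for actually reviewing the code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    agent: str          # which agent/scaffold produced this solution (hypothetical field)
    tests_passed: bool  # did the candidate pass the test suite?
    diff_lines: int     # size of the change, used here as a crude review-cost proxy


def review_queue(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that pass the tests, smallest diff first.

    Assumption: a smaller diff is cheaper to review. This only orders
    the queue; a human (or a review agent) still picks the winner.
    """
    passing = [c for c in candidates if c.tests_passed]
    return sorted(passing, key=lambda c: c.diff_lines)


if __name__ == "__main__":
    runs = [
        Candidate("agent-a", True, 240),
        Candidate("agent-b", True, 85),
        Candidate("agent-c", False, 40),
        Candidate("agent-d", True, 130),
    ]
    for c in review_queue(runs):
        print(c.agent, c.diff_lines)
```

The point is just that once generation is cheap, the filter-and-rank step is where the remaining effort goes.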
