Comment by cheema33
2 months ago
> I'm sure one could solve that more generally, by putting the agent writing the code in a loop with some other code reviewing agent.
This x 100. I get so much better quality code if I have LLMs review each other's code and apply corrections. It is ridiculously effective.
Can you elaborate a little more on your setup? Are you manually copyong and pasting code from one LLM to another, or do you have some automated workflow for this?
No manual copy paste. That is not good use of time. I work in a git repo and point multiple LLMs at it.
One LLM reviews existing code and the new requirement and then creates a PRD. I usually use Augment Code for this because it has a good index of all local code.
I then ask Google Gemini to review the PRD and validate it and find ways to improve it. I then ask Gemini to create a comprehensive implementation plan. It frequently creates a 13 step plan. It would usually take me a month to do this work.
I then start a new session of Augment Code, feed it the PRD and one of the 13 tasks at a time. Whatever work it does, it checks it in a feature branch with detailed git commit comment. I then ask Gemini to review the output of each task and provide feedback. It frequently finds issues with implementation or areas of improvement.
All of this managed by using git. I make LLMs use git. I think would go insane if I had to copy/paste this much stuff.
I have a recipe of prompts that I copy/paste. I am trying to find ways to cut that down and making slow progress in this regard. There are tools like "Task Master" (https://github.com/eyaltoledano/claude-task-master) that do a good job of automating this workflow. However this tool doesn't allow much customization. e.g. Have LLMs review each other's work.
But, maybe I can get LLMs to customize that part for me...
I have been doing this with claude code and openai codex and/or cline. One of the three takes the first pass (usually claude code, sometimes codex), then I will have cline / gemini 2.5 do a "code review" and offer suggestions for fixes before it applies them.