Comment by rtp4me
8 hours ago
I never trust the opinion of a single LLM anymore, especially for more complex projects. I have seen Claude guarantee something is correct and then immediately apologize when I feed it a critical review from Codex or Gemini. And, many times, the issues are not minor but are significant oversights by Claude.
My habit now: always get a 2nd or 3rd opinion before assuming one LLM is correct.
Happy to see someone else doing this.
All code written by an LLM is reviewed by an additional LLM. Then I verify that review and get one of the agents to iterate on everything.
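A rough sketch of that write/review/iterate loop, in case it helps. Everything named here (`write_and_review`, the `Writer`/`Reviewer`/`Approver` callables, the stubs) is hypothetical and made up for illustration. No real vendor API is called, and the `approve` hook just stands in for the manual "verify that review" step.

```python
from typing import Callable, Optional

# All names here are hypothetical; plug in whatever clients you actually use.
Writer = Callable[[str, Optional[str]], str]   # (task, feedback) -> code
Reviewer = Callable[[str], Optional[str]]      # code -> critique, or None if clean
Approver = Callable[[str], bool]               # human check: is the critique valid?


def write_and_review(
    write: Writer,
    review: Reviewer,
    task: str,
    approve: Approver = lambda critique: True,
    max_rounds: int = 3,
) -> str:
    """Draft with one model, critique with another, iterate on accepted critiques."""
    code = write(task, None)
    for _ in range(max_rounds):
        critique = review(code)
        # Stop when the reviewer finds nothing or the human rejects the critique.
        if critique is None or not approve(critique):
            break
        code = write(task, critique)
    return code


# Stub models so the sketch runs without any API access.
def stub_writer(task: str, feedback: Optional[str]) -> str:
    return "v1" if feedback is None else "v2 (fixed: " + feedback + ")"


def stub_reviewer(code: str) -> Optional[str]:
    return "off-by-one in loop" if code == "v1" else None


final_code = write_and_review(stub_writer, stub_reviewer, "parse the log file")
```

The `max_rounds` cap is only there so two models that keep disagreeing cannot loop forever.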
Agreed. From my experience, Claude is the top-level coder, Gemini is the architect, and Codex is really good at finding bugs and logic errors. In fact, Codex seems to do deeper analysis than the other two.
I just round-robin them until I run out of quota on whatever subscription level I'm on. I only use the Claude API, so I pay per token there... I consider using Claude as "bringing out the big guns" because I also think it's the top-level coder.
It doesn’t have to be different foundation models. As long as the temperature is up, ask the same model 100 times.
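One way to read "ask the same model 100 times" is plain majority voting over repeated samples. A minimal sketch, assuming the answers are short enough to compare as strings; `majority_answer` and `stub_model` are hypothetical names, and the stub replaces a real model call made at a high temperature.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Tuple


def majority_answer(
    ask: Callable[[str], str], prompt: str, n: int = 100
) -> Tuple[str, float]:
    """Ask the same model n times; return the most common answer and its vote share."""
    answers = [ask(prompt).strip() for _ in range(n)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n


# Stand-in for a sampled model call: answers vary from one call to the next.
_canned = cycle(["yes", "yes", "no", "yes"])


def stub_model(prompt: str) -> str:
    return next(_canned)


answer, share = majority_answer(stub_model, "Is this function thread-safe?", n=100)
```

A low vote share is probably the more useful signal here: if the samples split close to evenly, that is a hint the question deserves a second look.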