Comment by Roark66

2 months ago

I found the winning combination is to use all of them, in this way:

- First, you need a vendor-agnostic tool like opencode (I had to add my own vendors, as it didn't support them properly out of the box).
- Second, you set up agents with different models. I use:
  - for architecture and planning: Opus, Sonnet, GPT 5.2, Gemini 3, depending on specifics. For example, I found GPT better at troubleshooting, Sonnet better at pure code planning, Opus better at DevOps, and Gemini the best for single-shot stuff.
  - for execution of said plans: Qwen 2.5 Coder 30B (yes, it's even better in my use cases than Qwen3, despite benchmarks), Sonnet (only when absolutely necessary), Qwen3-235B (between Qwen 2.5 and Sonnet).
  - for verification: Gemini 3 Flash, Qwen3-480B, etc.
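To make the setup concrete, here is a minimal sketch of that role-to-model split as a plain lookup. Everything in it is illustrative: the strings are just labels copied from the list above (not real API model identifiers), the function names are made up, and this is not opencode's actual configuration format.

```python
# Hypothetical routing tables. The values are labels from the comment,
# not real provider model IDs, and this is not opencode's config format.
PLANNERS = {
    "troubleshooting": "GPT 5.2",
    "code_planning": "Sonnet",
    "devops": "Opus",
    "single_shot": "Gemini 3",
}

# Execution models ordered from the default choice to the last resort.
EXECUTORS = ["Qwen 2.5 Coder 30B", "Qwen3-235B", "Sonnet"]

VERIFIERS = ["Gemini 3 Flash", "Qwen3-480B"]


def planner_for(task_kind: str) -> str:
    """Return the planning model label for a kind of task."""
    return PLANNERS[task_kind]


def executor_for(escalation: int = 0) -> str:
    """Return the execution model label; a higher escalation moves toward the last resort."""
    # Clamp so that over-escalating stays on the final entry.
    return EXECUTORS[min(escalation, len(EXECUTORS) - 1)]
```

The point of the ladder in `EXECUTORS` is that the expensive model is only reached when the cheaper ones have failed, which matches "only when absolutely necessary".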

The biggest savings come from keeping the context small and, where many turns are required, using smaller models. For example, a single 30-minute troubleshooting session with Gemini 3 can cost $15 if you run it "normally", or $2 if you use the agents and wipe the context after most turns (which is possible because progress is tracked in a plan file).
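A rough sketch of why wiping helps, assuming each turn resends the whole accumulated context as input. The function name and all numbers are made up for illustration, and it ignores output tokens and any provider-side prompt caching, so it does not reproduce the dollar figures above.

```python
def session_input_tokens(turns: int, tokens_per_turn: int,
                         base_context: int, wipe: bool) -> int:
    """Total input tokens sent over a session (simplified, illustrative model).

    base_context: tokens always present (system prompt, plan file, current task).
    tokens_per_turn: tokens each turn adds to the history when it is kept.
    wipe: if True, the history is dropped after every turn.
    """
    total = 0
    context = base_context
    for _ in range(turns):
        # The whole current context is sent as input on this turn.
        total += context
        if wipe:
            context = base_context      # start the next turn fresh
        else:
            context += tokens_per_turn  # history keeps growing
    return total


# Assumed numbers: 40 turns, 5,000 new tokens per turn, 10,000-token base.
kept = session_input_tokens(40, 5_000, 10_000, wipe=False)
wiped = session_input_tokens(40, 5_000, 10_000, wipe=True)
print(kept, wiped)
```

With the history kept, input grows roughly with the square of the number of turns. With wiping, it grows linearly, so under these assumed numbers the wiped session sends about a tenth of the input tokens.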
