← Back to context

Comment by tempestn

1 day ago

I've been using both GPT 5.2 and Gemini 3 Pro a lot. I was very impressed with 3 Pro when it came out, and thought I'd cancel my OAI Plus, but I've since found that for important tasks it's been beneficial to compare the results from both, or even bounce between them. They're different enough that it's like collaborating with a team.

I have been thinking about this a bit - so rather than rely on one have an agentic setup that could take question run against the top 3 and then another one to judge the response to give back.

Is anyone doing this for high stake questions / research?

The argument against is that the models are fairly 'similar' as outlined in one of the awarded papers from Neurips '25 - https://neurips.cc/virtual/2025/loc/san-diego/poster/121421

  • I often put the models in direct conversation with each other to work out a framework or solution. It works pretty well, but they do tend to glaze each other a bit.