Comment by shiandow

14 hours ago

Comparing the answers from OpenAI and Gemini, Gemini's writing style is a lot clearer. It could be presented a bit better, but the proof is easy enough to follow. This also makes it a lot shorter than the answer given by OpenAI, and it uses proper prose.

aabhay 8 hours ago

Based on the answers Google presented, it's possible that the report is generated post hoc as a summarization of the prior thoughts. One could also presume that this summarization step is part of the mechanism for running the Tree of Thoughts, so that this wasn’t some manual “now give the final answer” command.

cubefox 14 hours ago

I found the proofs you were referring to:

Google https://storage.googleapis.com/deepmind-media/gemini/IMO_202...

OpenAI https://github.com/aw31/openai-imo-2025-proofs/

sweezyjeezy 13 hours ago
Gemini is clearer, but MY GOD is it verbose. E.g. look at problem 1, section 2, "Analysis of the Core Problem": there's nothing at all deep here, but it seems the model wants to spell out every single tiny logical step. I wonder if this is a stylistic choice or something that actually helps the model get to the end.
- vessenes 11 hours ago
  
  They actually do help, in that they give the model more computation time and also allow real-time management of the input context by the model. You can see the same behavior in the excessive comment writing some coding models engage in; Anthropic interviews said these do actually help the model.
  
- shiandow 12 hours ago
  
  Section 2 is a case-by-case analysis. Those are never pretty, but they are perfectly normal given the problem.
  With OpenAI that part takes up about 2/3 of the proof, even with its fragmented prose. I don't think it does much better.
  
CamperBob2 9 hours ago
Kind of disappointing that neither provider shows the unsuccessful attack on problem 6.
- cubefox 7 hours ago
  
  They don't show any reasoning traces at all, just the final proofs. We must assume the traces are pretty huge, since at least Google makes it clear that they are heavily relying on inference compute:
  > We achieved this year’s result using an advanced version of Gemini Deep Think – an enhanced reasoning mode for complex problems that incorporates some of our latest research techniques, including parallel thinking. This setup enables the model to simultaneously explore and combine multiple possible solutions before giving a final answer, rather than pursuing a single, linear chain of thought. [...] We will be making a version of this Deep Think model available to a set of trusted testers, including mathematicians, before rolling it out to Google AI Ultra subscribers.