Comment by sweezyjeezy

7 months ago

Gemini is clearer but MY GOD is it verbose. E.g. look at problem 1, section 2 ("Analysis of the Core Problem"): there's nothing at all deep here, but it seems the model wants to spell out every single tiny logical step. I wonder if this is a stylistic choice or something that actually helps the model get to the end.

6 comments

vessenes 7 months ago

They actually do help - in that they give the model more computation time and also allow realtime management of the input context by the model. You can see this same behavior in the excessive comment writing some coding models engage in; Anthropic interviews said these do actually help the model.

johnfn 7 months ago
Gemini did not one-shot these answers; it did its thinking elsewhere (probably not released by Google) and then it consolidated it down into what you see in the PDF. From the article:
> We achieved this year’s result using an advanced version of Gemini Deep Think – an enhanced reasoning mode for complex problems that incorporates some of our latest research techniques, including parallel thinking. This setup enables the model to simultaneously explore and combine multiple possible solutions before giving a final answer, rather than pursuing a single, linear chain of thought.
I don't see any parallel thinking in the PDF, for example, so that was probably elided from the final results.
- noahgav 7 months ago
  
  Yes, because these are the answers it gave, not the thinking.

shiandow 7 months ago

Section 2 is a case by case analysis. Those are never pretty but perfectly normal given the problem.

With OpenAI that part takes up about 2/3 of the proof, even with its fragmented prose. I don't think it does much better.

sweezyjeezy 7 months ago
It's not it being case by case that's my issue. I used to do olympiads and e.g. for the k>=3 case I wouldn't write much more than:
"Since there are 3k - 3 points on the perimeter of the triangle to be covered, and any sunny line can pass through at most two of them, it follows that 3k − 3 ≤ 2k, i.e. k ≤ 3."
Gemini writes:
> Let T_k be the convex hull of P_k. T_k is the triangle with vertices V_1 = (1, 1), V_2 = (1, k), V_3 = (k, 1). The edges of T_k lie on the lines x = 1 (V), y = 1 (H), and x + y = k + 1 (D). These lines are shady.
>
> Let B_k be the set of points in P_k lying on the boundary of T_k. Each edge contains k points. Since the vertices are distinct (as k ≥ 2), the total number of points on the boundary is |B_k| = 3k − 3.
>
> Suppose P_k is covered by k sunny lines L_k. These lines must cover B_k. Let L ∈ L_k. Since L is sunny, it does not coincide with the lines containing the edges of T_k. A line that does not contain an edge of a convex polygon intersects the boundary of the polygon at most at two points. Thus, |L ∩ B_k| ≤ 2. The total coverage of B_k by L_k is at most 2k. We must have |B_k| ≤ 2k. 3k − 3 ≤ 2k, which implies k ≤ 3.
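
For anyone who wants to sanity-check the counting step, here is a rough brute-force sketch in Python. It assumes P_k is the set of lattice points (a, b) with a, b ≥ 1 and a + b ≤ k + 1 (inferred from the triangle's vertices in the quote), and that "sunny" means not parallel to x = 0, y = 0, or x + y = 0. The function names are made up for illustration and don't come from either proof.

```python
from itertools import combinations

def points(k):
    """Assumed P_k: lattice points (a, b) with a, b >= 1 and a + b <= k + 1."""
    return [(a, b) for a in range(1, k + 1) for b in range(1, k + 2 - a)]

def boundary(k):
    """B_k: points of P_k on x = 1, y = 1, or x + y = k + 1."""
    return [(a, b) for (a, b) in points(k)
            if a == 1 or b == 1 or a + b == k + 1]

def is_sunny(p, q):
    """True if the line through p and q is not parallel to
    x = 0, y = 0, or x + y = 0."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return dx != 0 and dy != 0 and dx + dy != 0

def max_boundary_hits(k):
    """Largest number of boundary points on any sunny line.

    A line hitting two or more boundary points is determined by a pair
    of them, so checking all pairs is enough.
    """
    pts = boundary(k)
    best = 1  # a single point always lies on some sunny line
    for p, q in combinations(pts, 2):
        if not is_sunny(p, q):
            continue
        dx, dy = q[0] - p[0], q[1] - p[1]
        # r is on the line through p and q iff the cross product vanishes
        hits = sum(1 for r in pts
                   if (r[0] - p[0]) * dy == (r[1] - p[1]) * dx)
        best = max(best, hits)
    return best

for k in range(2, 8):
    size = len(boundary(k))
    # columns: k, |B_k|, max hits per sunny line, whether |B_k| <= 2k
    print(k, size, max_boundary_hits(k), size <= 2 * k)
```

This only checks small k by enumeration, so it illustrates the inequality 3k − 3 ≤ 2k rather than proving anything.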
- shiandow 7 months ago
  
  I'll admit I didn't look too deeply at whether it could be done more simply, but surely that is still miles better than what OpenAI did? At least Gemini's can be simplified. OpenAI labels all points and then enumerates all the lines that go through them.