Comment by CuriouslyC

7 hours ago

Gemini had the best long-context support for the longest time, and even now, at >400k tokens, it still has the best long-context recall.

Gemini is just not trained for autonomy/tool use/agentic behavior to the same degree as the other frontier models. Goog seems to emphasize video/images/scientific+world knowledge.

My experience is that it advertises a large context window and then just becomes incoherent and confused as that context fills up.

e.g. it sucks at general tool use, but sucks even more after it's been running in a session for a while. One frustrating failure mode is watching it get stuck in a loop, trying and failing to edit the same source files.

I often wonder how my old coworkers at Google get by, if this is the agentic coding they have available for working on projects in google3. But I suspect the models they work with have been fine-tuned on Google's custom tooling and perform better.