Comment by svantana
1 day ago
I suspect that the main goal here was to grab the top spot in a bunch of benchmarks, and being counted as an "available" model.
They're using it as a major inducement to upgrade to AI Ultra. I mean, the image and video stuff is neat, but adds no value for the vast majority of AI subscribers, so right now this is the most notable benefit of paying 12x more.
FWIW, Google seems to be having severe issues with an oddball, perhaps malfunctioning quota system. I regularly find that very light use of gemini-cli supposedly hits the purported 1000-request limit, when in reality I've made fewer than 10 requests.
I faced the exact same problem with the API. It seems it doesn't throttle early enough, then may accumulate the cool-off period, making it impossible to determine when to fire requests again.
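The only workaround I've found is client-side backoff: treat 429s (and 503s) as a signal to wait, honor Retry-After when it's present, and otherwise back off exponentially, since the API doesn't reliably tell you when the window resets. Rough, untested sketch in Python against the public REST endpoint (the model name and the Retry-After parsing are assumptions on my part):

    import random
    import time

    import requests

    API_KEY = "..."  # your Gemini API key
    # Model name here is just an example; use whichever model you call.
    URL = ("https://generativelanguage.googleapis.com/v1beta/models/"
           "gemini-2.5-pro:generateContent?key=" + API_KEY)

    def generate(prompt, max_attempts=6):
        body = {"contents": [{"parts": [{"text": prompt}]}]}
        delay = 2.0
        for _ in range(max_attempts):
            resp = requests.post(URL, json=body)
            if resp.ok:
                return resp.json()
            if resp.status_code in (429, 503):
                # Honor Retry-After if present (assuming it's in seconds;
                # it can also be an HTTP date), otherwise back off
                # exponentially with a little jitter.
                try:
                    wait = float(resp.headers.get("Retry-After", delay))
                except ValueError:
                    wait = delay
                time.sleep(wait + random.uniform(0, 1))
                delay = min(delay * 2, 60)
                continue
            resp.raise_for_status()
        raise RuntimeError("gave up after %d rate-limited attempts" % max_attempts)

Even with this, I can't tell whether the cool-off windows stack, so the 60s cap is a guess.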
Also, I noticed Gemini (even Flash) has Google Search support, but only via the web UI or the native mobile app. Via the API, that would require a SERP integration via MCP or the like, even with Gemini Pro.
Oh, and some models regularly face outages; 503s are not uncommon. No SLA page, no alerts, nothing.
The reasoning feature is buggy: even when disabled, it sometimes triggers anyway.
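If anyone wants to check for themselves, this is roughly how it's nominally disabled with the google-genai Python SDK (thinking_budget=0 is documented to turn thinking off for Flash; the usage-metadata field name is my best reading of their docs, so treat this as a sketch):

    from google import genai
    from google.genai import types

    client = genai.Client(api_key="...")

    resp = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Summarize HTTP/1.1 keep-alive in one sentence.",
        config=types.GenerateContentConfig(
            # A budget of 0 is supposed to disable thinking entirely.
            thinking_config=types.ThinkingConfig(thinking_budget=0),
        ),
    )

    # When the bug hits, this comes back non-zero despite the 0 budget.
    print(resp.usage_metadata.thoughts_token_count)
    print(resp.text)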
It occurred to me the other day that Google probably has the best engineers, given how well Gemini performs, where it's coming from, and its context window, which is uniquely large compared to any other model's. But it is likely run by managers coming from AWS, where shipping half-baked, barely tested software was all it took to get a bonus.