Comment by Alifatisk
6 hours ago
These numbers are impressive, to say the least. It looks like Google has produced a beast that will raise the bar even higher. What's even more impressive is how Google came into this game late and went from producing a few flops to being the leader at this (actually, they already achieved that title with 2.5 Pro).
What makes me even more curious is the following
> Model dependencies: This model is not a modification or a fine-tune of a prior model
So did they start from scratch with this one?
Google was never really late. Where people perceived Google to have dropped the ball was in its productization of AI. Google's Bard branding stumble was so (hilariously) bad that it threw a lot of people off the scent.
My hunch is that, aside from "safety" reasons, the Google Books lawsuit left some copyright wounds that Google did not want to reopen.
Google’s productization is still rather poor. If I want to use OpenAI’s models, I go to their website, look up the price and pay it. For Google’s, I need to figure out whether I want AI Studio or Google Cloud Code Assist or AI Ultra, etc, and if this is for commercial use where I need to prevent Google from training on my data, figuring out which options work is extra complicated.
As of a couple weeks ago (the last time I checked) if you are signed in to multiple Google accounts and you cannot accept the non-commercial terms for one of them for AI Studio, the site is horribly broken (the text showing which account they’re asking you to agree to the terms for is blurred, and you can’t switch accounts without agreeing first).
In Google’s very slight defense, Anthropic hasn’t even tried to make a proper sign in system.
Not to mention no macOS app. This is probably unimportant to many in the hn audience, but more broadly it matters for your average knowledge worker.
Oh, I remember the times when I compared Gemini with ChatGPT and Claude. Gemini was so far behind, it was barely usable. And now they are pushing the boundaries.
You could argue that chat-tuning of models falls more along the lines of product competence. I don't think there was much doubt about the upper ceiling of what people thought Google could produce... more "when will they turn on the tap" and "can Pichai be the wartime general to lead them?"
The memory of Microsoft's Tay fiasco was strong around the time the brain team started playing with chatbots.
Google was catastrophically traumatized throughout the org when they had that photos AI mislabel black people as gorillas. They turned the safety and caution knobs up to 12 after that for years, really until OpenAI came along and ate their lunch.
Oh, they were so late there were internal leaked ("leaked"?) memos about a couple of grad students with a $100 budget outdoing their lab a couple of years ago. They picked themselves up real nice, but it took a serious reorg.
Bard was horrible compared to the competition of the time.
Gemini 1.0 was strictly worse than GPT-3.5 and was unusable due to "safety" features.
Google followed that up with 1.5 which was still worse than GPT-3.5 and unbelievably far behind GPT-4. At this same time Google had their "black nazi" scandals.
With Gemini 2.0, Google finally had a model that was at least useful for OCR, and with their Flash series a model that, while not up to par in capabilities, was sufficiently inexpensive that it found uses.
Only with Gemini-2.5 did Google catch up with SoTA. It was within "spitting distance" of the leading models.
Google did indeed drop the ball, very, very badly.
I suspect that Sergey coming back helped immensely, somehow. I suspect that he was able to tame some of the more dysfunctional elements of Google, at least for a time.
At least at the moment, coming in late seems to matter little.
Anyone with money can trivially catch up to a state of the art model from six months ago.
And as others have said, late is really a function of spigot, guardrails, branding, and ux, as much as it is being a laggard under the hood.
> Anyone with money can trivially catch up to a state of the art model from six months ago.
How come apple is struggling then?
Anyone with enough money and without an entrenched management hierarchy preventing the right people from being hired and enabled to run the project.
Apple is struggling with _productizing_ LLMs for the mass market, which is a separate task from training a frontier LLM.
To be fair to Apple, so far the only mass market LLM use case is just a simple chatbot, and they don't seem to be interested in that. It remains to be seen whether what Apple wants to do ("private" LLMs with access to your personal context, acting as intimate personal assistants) is even possible to do reliably. It sounds useful, and I do believe it will eventually be possible, but no one is there yet.
They did botch the launch by announcing the Apple Intelligence features before they were ready, though.
It looks more like a strategic decision tbh.
They may want to use a 3rd party, or just wait for AI to be more stable and see how people actually use it, instead of adding slop into the core of their product.
Sit and wait per usual.
Enter late, enter great.
One possibility here is that Google is dribbling out cutting edge releases to slowly bleed out the pure play competition.
Being known as a company that is always six months behind the competition isn't something to brag about...
I was referring to a new entrant, not perpetual lag
Apple has entered the chat.
> So did they start from scratch with this one
Their major version number bumps are a new pre-trained model. Minor bumps are changes/improvements to post-training on the same foundation.
I hope they keep the pricing similar to 2.5 Pro. Currently I pay per token, and 2.5 Pro and GPT-5 are close to the sweet spot for me, but Sonnet 4.5 feels too expensive for larger changes. I've also been moving around 100M tokens per week with Cerebras Code (they moved to GLM 4.6), but the flagship models still feel better when I need help with more advanced debugging, or some exemplary refactoring to then feed as an example to a dumber/faster model.
And also, critically, being the only profitable company doing this.
It's not like they're making their money from this, though. All AI work is heavily subsidised; for Alphabet, it just happens that the funding comes from within the megacorp. If MS had fully absorbed OpenAI back when their board nearly sank the ship, they'd be in the exact same situation today.
They're not making money, but they're in a much better situation than Microsoft/OpenAI because of TPUs. TPUs are much cheaper than Nvidia cards both to purchase and to operate, so Google's AI efforts aren't running at as much of a loss as everyone else. That's why they can do things like offer Gemini 3 Pro for free.
What does it mean nowadays to start from scratch? At least in the open scene, most of the post-training data is generated by other LLMs.
They had to start with a base model; that part I am certain of.