Comment by ActorNightly

7 days ago

Yes. Google is probably gonna win the LLM game, tbh. They had a massive head start with TPUs, which are very energy-efficient compared to Nvidia cards.

The only one who can stop Google is Google.

They’ll definitely have the best model, but there is a chance they will f*up the product / integration into their products.

  • It would take talent for them to mess up hosting businesses who want to use their TPUs on GCP.

    But then again, even there, their reputation for abandoning products, their lack of customer service, and their condescension toward large enterprises’ “legacy tech” let Microsoft (the king of hand-holding big enterprise) and even AWS run roughshod over them.

    When I was at AWS ProServe, we didn’t even bother coming up with talking points when competing with GCP except to point out how they abandon services. Was it partially FUD? Probably. But it worked.

    • >It would take talent for them to mess up hosting businesses who want to use their TPUs on GCP.

      there are few groups as talented at losing a head start as google.

      1 reply →

    • > It would take talent for them to mess up hosting businesses who want to use their TPUs on GCP.

      > But then again even there, their reputation for abandoning products

      What are the chances of abandoning TPU-related projects where the company literally invested billions in infrastructure? Zero.

      4 replies →

Google will win the LLM game if the LLM game is about compute, which is the common wisdom and maybe true, but not foreordained by God. There's an argument the other way: if compute were the dominant term, Google would never have been anything but far in the lead.

Personally, right now I see one clear leader and one group going 0 to 99 like a five-sigma cosmic ray: Anthropic and the PRC. But that's because I believe/know that all the benchmarks are gamed as hell; it's like asking if a movie star has had cosmetic surgery. On quality, Opus 4 is 15x the cost and still sold out / backordered. Qwen 3 is arguably in next place.

In both of those cases, extremely high-quality expert labeling at scale (assisted by the tool itself) seems to be the secret sauce.

Which is how it would play out if history is any guide: when compute as a scaling lever starts to flatten, you expert-label like it's 1987 and claim it's compute and algorithms until the government wises up and stops treating your success as a national security priority. It's the easiest trillion Xi Jinping ever made: pretending to think LLMs are AGI too, fast-following for pennies on the dollar, and propping up a stock market bubble to go with the fentanyl crisis? 9-D chess. It's what I would do about AI if I were China.

Time will tell.

  • I believe Google might win the LLM game simply because they have the infrastructure to make it profitable - via ads.

    All the LLM vendors are going to have to cope with the fact that they're lighting money on fire, and Google has the paying customers (advertisers) and, with the user-specific context they get from their LLM products, one of the juiciest and most targetable ad audiences of all time.

    • This is one of the best insights after reading 100+ posts here. You are talking about existing demand and existing relationships from their advert business. These customers are happy with Google results. The Google marketing team will be carefully defining new advert products that employ LLMs.

      I would only offer one disagreement with your post: There will not be a single winner in LLMs. The landscape is so large that we will have multiple winners in different areas. Example: Google might fail in B2C LLM (chatbot that answers your questions), but will certainly be (wildly?) successful in B2B for adverts.

  • Everyone seems to forget about MuZero, which was arguably more important than the transformer architecture.

Yeah, honestly. They could just sell solutions and SLAs combining their TPU hardware with on-prem SOTA models and practically dominate enterprise. From what I understand, that's GCP's game plan for most regulated enterprise clients anyway.

  • Google's bread and butter is advertising, so they have a huge interest in keeping things in-house. Data is more valuable to them than money from hardware sales.

    Even then, I think their primary use case is going to be consumer-grade, good AI on phones. I dunno why the Gemma QAT models fly so low on the radar, but you can basically get full-scale Llama 3-like performance from a single 3090 now, at home.

    • It’s my understanding that Google makes the bulk of its ad money from search ads; sure, they harvest a ton of data, but it isn’t as valuable to them as you’d think. I suspect they know that could change, so they’re hoovering up as much as they can to hedge their bets. Meta, on the other hand, is all about targeted ads.

      1 reply →

  • Renting out hardware like that would be such a cleansing, old-school revenue stream for Google... just imagine...
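On the Gemma QAT point a few comments up: quantization-aware training works by "fake-quantizing" weights during training (rounding them to a low-bit grid, then mapping back to float) so the network learns to tolerate the rounding error. Here is a minimal sketch of that fake-quant step; `fake_quant` is a hypothetical helper illustrating the general technique, not Gemma's actual code:

```python
import numpy as np

def fake_quant(w, num_bits=4):
    """Round weights to a symmetric low-bit grid, then map back to float.
    In QAT this runs inside the forward pass during training, so the
    network adapts to the rounding. (Illustrative sketch only.)"""
    qmax = 2 ** (num_bits - 1) - 1                      # e.g. 7 for 4-bit
    scale = np.abs(w).max() / qmax                      # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)   # integer codes
    return q * scale                                    # dequantized float weights

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
w_q = fake_quant(w)
print(f"distinct 4-bit levels: {np.unique(w_q).size}")  # at most 2**4 = 16
print(f"mean abs rounding error: {np.abs(w - w_q).mean():.3f}")
```

Real deployments add per-channel scales and a straight-through estimator for the gradient, but the core idea is just this round-and-dequantize step baked into training.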

Hasn’t the Inferentia chip been around long enough to make the same argument? AWS and Google probably have the same order of magnitude of their own custom chips

But they’re ASICs, so any big architecture change will be painful for them, right?

  • TPUs are accelerators for the common operations found in neural nets. A big part is simply a massive number of matrix FMA units for processing enormous matrix operations, which comprise the bulk of a forward pass through a model. Caching enhancements and big memory growth were needed to accommodate transformers, but on the hardware side not a huge amount has changed, and the fundamentals from years ago still power the latest models. The hardware just keeps getting faster, with more memory and more parallel processing units, and later more data types to enable hardware-supported quantization.

    So it isn't like Google designed a TPU for a specific model or architecture. They're pretty general purpose in a narrow field (oxymoron, but you get the point).

    The set of operations Google designed into a TPU is very similar to what Nvidia did, and it's about as broadly capable. But Google owns the IP, doesn't pay the premium, and gets to design for its own specific needs.

    • There are plenty of matrix multiplies in the backward pass too. Obviously this is less useful when serving but it's useful for training.

  • I'd think not. They have the hardware and software experience, and likely have next and next-next generation plans in place already. The big hurdle is money, which G has a bunch of.
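To ground the TPU discussion above: for a linear layer, both the forward pass and the backward pass reduce to matrix multiplies, which is why a matmul-heavy ASIC stays useful across architecture changes. A rough sketch in numpy (sizes are made up; the FLOP ratio is the point):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear layer y = x @ W (hypothetical sizes).
batch, d_in, d_out = 32, 4096, 4096
x = rng.normal(size=(batch, d_in)).astype(np.float32)
W = rng.normal(size=(d_in, d_out)).astype(np.float32)

# Forward: one big matmul, ~2*batch*d_in*d_out FLOPs, versus
# ~batch*d_out FLOPs for the elementwise activation after it.
y = x @ W
matmul_flops = 2 * batch * d_in * d_out
activation_flops = batch * d_out
print(f"matmul/elementwise FLOP ratio: {matmul_flops // activation_flops}x")  # 2*d_in = 8192x

# Backward: given the upstream gradient dL/dy, both gradients are matmuls too.
dy = rng.normal(size=(batch, d_out)).astype(np.float32)
dW = x.T @ dy          # gradient w.r.t. weights (training)
dx = dy @ W.T          # gradient w.r.t. inputs (backprop to earlier layers)
```

So training roughly triples the matmul work of inference (one forward matmul plus two backward ones), but it is all the same kind of operation the hardware is built around.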