Comment by thanhhaimai

18 hours ago

Opinions are my own.

I think the biggest winner of this might be Google. Virtually all the frontier AI labs use TPUs. The only one that doesn't use TPUs is OpenAI, due to the exclusive deal with Microsoft. Given the newly launched Gen 8 TPU this month, it's likely OpenAI will contemplate using TPUs too.

Many labs use TPUs, but not exclusively. Most labs need more compute than they can get, and if there's TPU capacity, they'll adapt their systems to be able to run partially on TPUs.

  • Even Google doesn't only use TPUs.

    • Google is in a different position from the others in that they're the only frontier lab with a cloud infra business. It obviously makes sense to sell GPUs on cloud infra, as people want to rent them; in that respect, Google buys a ton of GPUs to rent out.

      What's unclear to me is how much Google uses GPUs for their own stuff. Yes, Gemini runs on GPUs now, so that Google can sell Gemini on-prem boxes (a recent release announced last week), but is any training or inference for Gemini really happening on GPUs? I'd have guessed not, given that I thought TPUs were much cheaper to operate, but maybe I'm wrong.

      Caveat: I work at Google, but not on anything to do with this. I'm only going on what's in the press.

And, almost by happenstance, Apple. It turns out they have a great platform for inference and torched comparatively little on Siri. The Apple/Gemini deal is interesting; Google continues to demonstrate their willingness to degrade their experience on Apple devices to try to force people to switch.

  • If you do the math (I did), in 2 years, open source models that you can run on a future MacBook Pro will be as capable as the frontier cloud models are today. Memory bandwidth is growing rapidly, as is the die area dedicated to the neural cores, and all the while the silicon is getting more power efficient and increasingly dense (as it always does). These hardware improvements are arriving just as the open source models improve through research advances.

    And while the cloud models will always be better (because they can use as much power as they want, up in the cloud), what matters to most of us is whether a model can do a meaningful share of knowledge work for us.

    At the same time, energy consumption for cloud infrastructure is outpacing the creation of new energy supply, which is a problem not easily solved. I believe scarcity of energy will increasingly drive frontier labs toward power efficiency, which necessarily implies that the Pareto frontier of performance between cloud and local execution will narrow.

    • An Opus 4.7/GPT-5.5-class model is 5 trillion parameters[1].

      To run an 8-bit quantized version of that, you need roughly 5TB of RAM.

      Today that is around 18 Nvidia B300s. That's around $900,000, not including the computers to run them in. (Quick sketch of the arithmetic below.)

      It's true that the capability of open source models is improving, but running actual frontier models on your MBP seems a way off.

      [1] https://x.com/elonmusk/status/2042123561666855235?s=20 (and Elon has hired enough people out of those labs to have a fair idea)
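
      A quick sketch of that arithmetic in Python, taking the 5T-parameter figure from [1] and ~288GB of HBM per B300 as given (both rough numbers):

        import math

        params = 5e12                 # 5T parameters, per [1]
        bytes_per_param = 1           # 8-bit quantization
        weights_gb = params * bytes_per_param / 1e9    # 5,000 GB of weights
        hbm_per_gpu_gb = 288          # approximate B300-class HBM capacity
        gpus = math.ceil(weights_gb / hbm_per_gpu_gb)  # ceil(17.4) = 18
        print(f"{weights_gb / 1000:.0f}TB of weights, >= {gpus} GPUs")

      And that's just to hold the weights; KV cache and activations push the count higher.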

      15 replies →

    • I did this calculation a while ago, and I don't think frontier models are just a few MacBook Pro generations away. Yes, numbers reliably go up in tech in general, but semiconductors and standards specifically have long lead times and published roadmaps, so we can have high confidence in what we're getting even 3-4 years out, in terms of both transistor density and RAM speeds.

      In mid-2028 we get N2E/N2P, with around 15% greater transistor density than today's N3P, and by EOY 2028 we'll likely have A14, with about a 35-40% density improvement.

      Meanwhile, we'll be on LPDDR6 by that point, which takes M-series Pros from 307GB/s -> ~400GB/s, and Maxes from 614GB/s -> ~800GB/s.

      Model improvements will obviously help, but on the raw hardware front these numbers aren't in the ballpark for frontier models. An H100 has ~3TB/s of memory bandwidth, fwiw. (Rough decode math below.)
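
      To put "not in the ballpark" in numbers: decode is roughly memory-bandwidth-bound, since each generated token has to stream the active weights through memory once. A minimal sketch of that ceiling, where the 100B active-parameter count is a made-up illustration rather than any specific model:

        def tokens_per_sec(bandwidth_gb_s: float, active_params_b: float, bits: int = 8) -> float:
            """Upper bound on decode speed for a memory-bandwidth-bound model."""
            bytes_per_token = active_params_b * 1e9 * bits / 8   # weights streamed per token
            return bandwidth_gb_s * 1e9 / bytes_per_token

        print(tokens_per_sec(800, 100))   # ~8 tok/s on a future ~800GB/s Max
        print(tokens_per_sec(3000, 100))  # ~30 tok/s on one H100 at ~3TB/s

      Real throughput is lower (attention and KV-cache reads aren't free), but the ratio between those two lines is the point.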

      2 replies →

    • So long as you don't require deep search grounding, like massive web indexes or document stores, which are hard to reproduce locally. You can do local agentic things that get close, or even do better depending on search strategy, but in theory a massive cloud service with huge data stores at hand should be able to produce better results.

      In practice, unless you're doing some kind of deep-research thing in the cloud, it'll optimize mostly for time and get you a good-enough answer rather than spending an hour or two. An hour of cloud searching over huge data stores is presumably not equivalent to an hour of local agentic searching.

      I think that problem will improve a little in the coming years as data curation gets optimized, but the information world will keep growing, so the advantage will likely remain with centralized services, as long as they offer their complete potential rather than a fraction of it.

  • They also degrade their own direct services with little warning or thought put into change management, so, to be fair, Apple may be getting the same quality of service as the rest of us.

    • I think that's just how Google is, by nature. They don't intentionally degrade their services; they just aren't a customer-centric company. They run on numbers. And the corporate culture doesn't really encourage support and maintenance work either.

  • Indeed. I'm wondering if Apple "missing the train" on AI ended up being a blessing for them. Not only in the Google deal, but also there are a lot of people doing interesting stuff locally.

  • Apple is basically in the same boat as AMD and Intel. They have a weak, raster-focused GPU architecture that doesn't scale to 100B+ parameter inference workloads and especially struggles with large-context prefill. TPUs smoke them on inference, and Nvidia hardware is far and away more efficient for training.

I wish Google would launch Mac Mini-like devices running their consumer-grade TPUs for local inference. I get that they don't want it to eat into their GCP margins, but it would still get them onto consumer desktops that Pixel Books could never penetrate (Chromebooks don't count, and will likely become obsolete soon due to the MacBook Neo).

> Microsoft will no longer pay a revenue share to OpenAI.

> Revenue share payments from OpenAI to Microsoft continue through 2030, independent of OpenAI’s technology progress, at the same percentage but subject to a total cap.

How is this helping OpenAI?

Don't forget Elon; I am sure this news will come up in the upcoming OpenAI vs. Elon Musk trial starting soon! I can't wait to hear all the discovery from this trial.

> The only one that doesn't use TPUs is OpenAI

For inference? This is from July 2025: OpenAI tests Google TPUs amid rising inference cost concerns, https://www.networkworld.com/article/4015386/openai-tests-go... / https://archive.vn/zhKc4

> ... due to the exclusive deal with Microsoft

This exclusivity went away in Oct 2025 (except for 'API' workloads).

  OpenAI has contracted to purchase an incremental $250B of Azure services, and Microsoft will no longer have a right of first refusal to be OpenAI’s compute provider.

https://blogs.microsoft.com/blog/2025/10/28/the-next-chapter... / https://archive.vn/1eF0V

[flagged]

  • Some on this forum will be working for companies with conflicts of interest on the topic, and if an employee's words were construed to be the opinions of the company, that could be bad for that person.

    • I was once almost fired for saying a little too much in an HN comment about pentesting. Being dragged into an office and given a dressing-down for posting was quite traumatic.

      The central issue (or so they claimed) was that people might misconstrue my comment as representing the company I was at.

      So yeah, I don’t understand why people are making fun of this. It’s serious.

      On the other hand, they were so uptight that I’m not sure “opinions are my own” would have prevented it. But it would have been at least some defense.

      5 replies →

  • > Who's else would they be?

    Their employer? They may work at a related company and be required to say this.

  • At this point that phrase is an attempt at status signaling.

    • it's hilarious though

      it's like people LARPing a Fortune 500 company CEO when they're giving their hot takes on social media

      reminds me of Trump ending his wild takes on social media with "thank you for your attention to this matter". It's so out of place that it makes it really funny.

      13 replies →

  • It's to cover their ass in the event someone makes a stink and quotes them as if it's a company opinion.

  • Tech companies train their employees to say this as part of their social media guidance.

In the recent Dwarkesh Podcast episode, Jensen Huang (Nvidia) said that virtually nobody but Anthropic uses TPUs. How does that add up?

  • I am not sure in what context Jensen said that, but Midjourney uses TPUs, and Apple uses TPUs. There are no other frontier labs that use them, but Google + Anthropic is two out of three frontier labs, so...

    You could reasonably say that "a majority of frontier labs use TPUs to train and serve their models."

    • Afaik, TPUs are only used for inference, not training. Maybe that was also what the quote referred to.

  • > How does that add up?

    He's been saying whatever is good for Nvidia for years now without any regard for truth or reason. He's one of the least trustworthy voices in the space.

    • Jensen hallucinates more than any LLM; he just speaks without thinking all that much about what he says, and he generalizes a lot. Trying to hold him accountable for imprecisions and gross simplifications is just going to frustrate whoever tries, without changing one bit of his behavior.

  • You're asking why a businessman would downplay the use of a competing product line?

    • This is the same guy who said OpenClaw was the most important software release ever. Statements like this make me question how technically competent these tech CEOs are.

      1 reply →

The only reason anyone uses a TPU is because they couldn't get the best GPUs.

  • Okay? I'm not sure where you're going with this.

    Google's TPUs have obvious advantages for inference and are competitive for training.

You think the company that just gave $40B to Anthropic is the winner? Interesting.

  • That deal is a win-win for Google. If they develop a better coding model than Anthropic and beat them at coding, then they win. If they don’t, they still win by making a ton of money from Anthropic long term.

    • Well, it's a loss for Google if all the money disappears into thin air, but I agree that the deal is mostly upside for them, given how (relatively) small the investment is.