Comment by onlyrealcuzzo

8 days ago

Nvidia seems cooked.

Google is crushing them on inference. By TPUv9, TPUs could be 4x more energy efficient and cheaper overall, even if Nvidia cuts its margins from 75% to 40% (rough math in the sketch below).
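
A back-of-envelope sketch of why a margin cut alone may not close the gap; every number here is a hypothetical placeholder, not a real price or power figure:

```python
# Back-of-envelope only; all numbers are hypothetical placeholders.

# Capex side: gross margin -> sale price.
unit_cost = 10_000                     # assumed manufacturing cost per accelerator ($)
price_75 = unit_cost / (1 - 0.75)      # $40,000 at 75% gross margin
price_40 = unit_cost / (1 - 0.40)      # ~$16,667 at 40% gross margin
print(1 - price_40 / price_75)         # ~0.58: the margin cut only drops price ~58%

# Opex side: a 4x energy-efficiency edge scales the power bill directly.
kw_nvidia, kw_tpu = 1.0, 1.0 / 4       # assumed average draw per accelerator (kW)
hours = 5 * 365 * 24                   # 5-year service life
rate = 0.08                            # assumed industrial rate ($/kWh)
print(kw_nvidia * hours * rate)        # ~$3,504 power cost per Nvidia chip
print(kw_tpu * hours * rate)           # ~$876 per TPU for the same work
```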

Cerebras will be substantially better for agentic workflows in terms of speed.

And if you don't care as much about speed and only cost and energy, Google will still crush Nvidia.

And Nvidia won't be cheaper for training new models either. By 2028, the vast majority of chips will be used for inference rather than training anyway.

Nvidia has no manufacturing moat, either: anyone can buy TSMC's output.

Power is the bottleneck in the US (and everywhere besides China). By TPUv9, Google is projected to be 4x more energy efficient. Once Google lets you run TPUs on-prem (starting with TPUv8), it's a no-brainer who you're going with.

These are GW-scale data centers. You can't just build four large-scale nuclear power plants in a year in the US (or anywhere, even China). And you can't just build 4 GW of solar farms in a year in the US to power your less efficient data center; the nameplate math below is even worse than it sounds. Maybe you could in China (if the economics were on your side, but they aren't). You sure as hell can't do it anywhere else (maybe India).
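
To make the solar point concrete, a quick sketch; the capacity factor and the 4x penalty are assumptions, not measurements:

```python
# Rough power-buildout arithmetic; inputs are assumptions, not measurements.

datacenter_avg_load_gw = 1.0     # a single GW-scale campus, running ~24/7
efficiency_penalty = 4.0         # hardware that is "4x less efficient" per unit of work
solar_capacity_factor = 0.25     # typical-ish US utility solar; varies by site

required_avg_gw = datacenter_avg_load_gw * efficiency_penalty
nameplate_solar_gw = required_avg_gw / solar_capacity_factor
print(nameplate_solar_gw)        # 16.0 GW of panels (plus storage), vs 4 GW for efficient chips
```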

What am I missing? I don't understand how Nvidia could've been so far ahead and just let every part of the market slip away.

> let every part of the market slip away.

Which part of the market has slipped away, exactly? Everything you wrote is supposition and extrapolation. Nvidia has a chokehold on the entire market. All the other players exist in the small pockets that Nvidia doesn't have enough production capacity to serve. And its dev ecosystem is still far ahead of everyone else's. Which provider gets chosen to equip a 100k-chip data center goes far beyond raw chip performance.

  • If code is getting cheaper, CUDA alternatives and tooling shouldn't be far behind. I can't see Nvidia holding its position for much longer.

  • > Nvidia has a chokehold on the entire market.

    You're obviously not looking at expected forward orders for 2026 and 2027.

    • I think most estimates have Nvidia at a more or less stable share of CoWoS capacity (around 60%), and that capacity is roughly doubling in '26.

Man, I hope someone drinks Nvidia's milkshake. They need to get humbled back to the point where they're desperate to sell GPUs to consumers again.

The only major roadblock is CUDA...

  • The nice thing about modern LLMs is that they're a relatively large, static use case. The compute is large and expensive enough that you can afford to just write custom kernels, to a degree. It's not like the classic CUDA setting, where you're running on 1, 2, or 8 GPUs, you need libraries that already do it all for you, and researchers are building lots of different models.

    There aren't all that many distinct building blocks shared across all of the transformer-based LLMs out there.

    • Yeah, given that frontier model training has shrunk down to a handful of labs, it seems like a very solvable problem to just build the stack directly without CUDA (the sketch below shows how small the core op set really is). LLMs are mechanically simple, and these labs have access to as much engineering muscle as they need. It's a pretty small price to pay for access to cheaper hardware, given that training runs cost on the order of $100M and every lab is paying Nvidia many multiples of that to fill up its new datacenters.
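
For a sense of how small that shared op set is, here is a minimal NumPy sketch of the pieces most decoder-only transformers reuse. Shapes and parameters are illustrative; real stacks differ in engineering details (KV caching, quantization, parallelism), not in the basic operations:

```python
import numpy as np

def rmsnorm(x, g, eps=1e-6):
    # RMSNorm, as used by Llama-family models
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps) * g

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v):
    # causal scaled-dot-product attention; q, k, v: (seq, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores += np.triu(np.full(scores.shape, -1e9), k=1)  # mask future tokens
    return softmax(scores) @ v

def swiglu(x, w_gate, w_up, w_down):
    # SwiGLU MLP: SiLU-gated linear unit, then down-projection
    gate = x @ w_gate
    return ((gate / (1 + np.exp(-gate))) * (x @ w_up)) @ w_down

# One decoder layer is just: norm -> attention -> residual -> norm -> MLP -> residual.
rng = np.random.default_rng(0)
seq, d, d_ff = 8, 16, 32
x = rng.normal(size=(seq, d))
wq, wk, wv, wo = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))

h = rmsnorm(x, np.ones(d))
x = x + attention(h @ wq, h @ wk, h @ wv) @ wo
h = rmsnorm(x, np.ones(d))
x = x + swiglu(h, rng.normal(size=(d, d_ff)) * 0.1,
               rng.normal(size=(d, d_ff)) * 0.1,
               rng.normal(size=(d_ff, d)) * 0.1)
print(x.shape)  # (8, 16) -- positional embeddings etc. omitted for brevity
```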

What puzzles me is that AMD can't secure any meaningful share of the AI market. They missed this train badly.

> What am I missing?

Memory capacity: the Cerebras/Groq SRAM-based architectures hold far less model state per chip than Nvidia's HBM-equipped GPUs (rough numbers below).

In parallel, Nvidia has locked in memory supply contracts well into the future that other manufacturers have been unable to secure.
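
Rough chip-count math for holding one set of model weights entirely in on-chip or on-package memory; the per-chip figures are approximate public numbers and the 70B/8-bit model is just an example:

```python
import math

model_params = 70e9        # example: a 70B-parameter model
bytes_per_param = 1        # 8-bit quantized weights
weights_gb = model_params * bytes_per_param / 1e9

# Approximate public memory figures; treat as order-of-magnitude only.
per_chip_gb = {
    "Nvidia H100 (HBM)":     80,
    "Cerebras WSE-3 (SRAM)": 44,
    "Groq LPU (SRAM)":       0.23,
}

for chip, gb in per_chip_gb.items():
    print(f"{chip}: {math.ceil(weights_gb / gb)} chip(s) just to hold the weights")
# -> 1, 2, and ~305 respectively: the SRAM designs buy speed by fanning
#    weights out across many chips, which is where Nvidia's density wins.
```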