Comment by pmb

9 hours ago

At this point, when you are doing big AI you basically have to buy it from Nvidia or rent it from Google. And Google can design its chips, interconnects, and systems in a whole-datacenter context, centralizing aspects that are impossible for chip vendors to centralize, so I suspect that when things get really big, Google's systems will always be more cost-efficient.

(disclosure: I am long GOOG, for this and a few other reasons)

I'd go long Google too if using Gemini CLI felt anything close to the experience I get with Codex or Claude. They might have great hardware, but it's worthless if their flagship coding agent gets stuck in loops trying to find the end-of-turn token.

  • Gemini CLI isn't a great product, unfortunately. While it's tied to a GUI, Antigravity is a far superior agent harness. I suggest comparing that to Claude Code instead.

    • Bad software kills good hardware.

      And the converse is true also. I mean, look at NVIDIA. For the longest time they were just a gaming card company, competing with AMD. I remember alternating between the two companies for my custom builds in the 90s and it basically came down to rendering speed and frame rate.

      But Jensen bet on the "compute engine" horse and pushed CUDA out, which became the de facto standard for doing fast, parallel arithmetic on a GPU. He was able to ride the Bitcoin wave and then the big one, DNNs. AMD still hasn't caught up (despite 15 years having gone by).

      1 reply →

    • I wish it were otherwise, but Antigravity is also a distant third behind the Codex CLI/app and Claude Code.

      Gemini 3.1 Pro is just fundamentally not on the same level. In every context I've tried it, for code review it acts like a model from a year ago, in that it's all hallucinated superficial bullshit.

      Claude Code is significantly less likely to produce the same (yet still does a decent amount). GPT-5.4 high/xhigh is on another level altogether, truly not comparable to Gemini.

  • I use Claude Code all day and use Gemini CLI for personal projects and I don't see the huge gap that other people seem to talk about a lot. Truthfully there are parts of Gemini CLI I like better than Claude Code.

    • I don't use Gemini CLI; I use the Gemini extension in VS Code, and it is barely usable in comparison to Claude or GPT-5.4. My experience (consistent with a lot of other reports) is that it takes a long time before answering, and frequently returns errors (after a long wait). But I think it's specific to the extension (and maybe the CLI), because the web version of Gemini works quickly and rarely errors (for me).

    • There was still a big gap, like, six months ago. Now I'm not seeing it either. It's been working well the last couple of weeks since I picked it up again.

  • Of the big three, Gemini gives me the worst responses for the type of tasks I give it. I haven't really tried it for agentic coding, but the LLM itself often gives long, meandering answers and adds weird little bits of editorializing that are unnecessary at best and misleading at worst.

    • Same. The tone is really off. Here is a response I just got from Gemini 3.1: "Your simulation results are incredibly insightful, and they actually touch on one of the most notoriously difficult aspects of ..." It's pure bullshit: my simulation results are in fact broken, and GPT spotted it immediately.

Isn't Amazon doing the same thing, making their own TPUs?

  • Yeah, Trainium and Inferentia. They're just not nearly as well supported at the software level. Google has already made sure this new generation will be supported by vLLM, SGLang, etc. Amazon's chips barely support those, and only multiple versions back. Super underinvested in (at least on the open-source side).

    • That seems odd. I'd figure if they are going to sell it as a product in AWS, they'd have some sort of off-the-shelf tooling available.

I'd bet that too if their management weren't so incredibly uninspiring. Like, Apple under Cook was also pretty mild and a huge step down from Jobs, but Google feels like it fell off a cliff. If it weren't for OpenAI releasing ChatGPT, they might still be sitting on that tech while only testing it internally. Now it drives their entire chip R&D.

  • To be fair, I don't think any of the AI players wanted what OAI did. Sam grabbed first mover at the cost of this insane race everyone else got forced into.

  • Google was calling itself an "AI-first" company beginning in 2016 or 2017. They designed and built TPUs nearly a decade ago and were using transformer models in products like Google Translate, but they didn't make a big fuss about it; it just made the product way better. People should at least credit Sundar somewhat for this. It turned out to be quite prescient, especially the advantage of having your own chips that are specifically designed for ML.

  • I am not a fan of this era, in which a CEO is expected to be a cult-leader type.

    Cook did very well in all areas as well as in not trying to create a cult.

  • They had no reason to destroy their golden goose; why release something that could hurt their money-printing business?

    Honestly, I'm rather impressed with how they handled it: they had enough of the infra and org in place to jump at it once the cat was out of the bag.

    Sundar declared a code red or whatever, and they made it happen. But that could ONLY happen if they had already built the bedrock of that ability.

    No one really remembers now that Google was a year behind.

> I suspect that when things get really big, Google's systems will always be more cost-efficient.

I actually hold the opposite hypothesis, for two reasons. First, Google has artificially limited production, and TSMC favours whoever can pay for the most capacity (since incremental capacity is very cheap for them), so Nvidia gets the first slot on each new process.

Second, GCP's operating margin is very high compared to, say, Hetzner or Lambda Labs, where you can get GPUs much cheaper than on GCP. So students and small researchers are stuck on GPUs.