Comment by Der_Einzige
15 hours ago
Google did not use TPUs for literally every bit of compute that led to Gemini. GCP has millions of high-end Nvidia GPUs, and programming for them is an order of magnitude easier, even for Googlers.
Any claim from Google that all of Gemini (including previous experiments) was trained entirely on TPUs is a lie. What they are truthfully saying is that the final training run was done entirely on TPUs. The market shouldn't react heavily to this; it should instead react positively to the fact that Google is now finally selling TPUs externally and their fab yields are better than expected.
> including all previous experiments
How far back do you go? What about experiments into architecture features that didn’t make the cut? What about pre-transformer attention?
But more generally, why are you so sure that the team that built Gemini didn't exclusively use TPUs while they were developing it?
I think that one of the reasons Gemini caught up so quickly is that they have so much compute at a fraction of the price everyone else pays.
Why should it not react heavily? What's stopping this from being the start of a trend for Google, and even for Amazon?
They are not lies.
JAX is very easy to use. Give it a try.
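For a sense of what it looks like, here's a minimal, self-contained sketch (purely illustrative, nothing to do with Gemini's actual training code): a jit-compiled forward pass and its gradient, which run unchanged on CPU, GPU, or TPU backends.

```python
# Minimal JAX sketch: jit-compiled prediction + gradient.
# Illustrative only; names (predict, loss) are made up for this example.
import jax
import jax.numpy as jnp

@jax.jit  # compiled with XLA; same code targets CPU, GPU, or TPU
def predict(params, x):
    w, b = params
    return jnp.tanh(x @ w + b)

def loss(params, x, y):
    # mean squared error between predictions and targets
    return jnp.mean((predict(params, x) - y) ** 2)

# jax.grad differentiates the scalar loss w.r.t. params
grad_fn = jax.jit(jax.grad(loss))

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (4, 1))
b = jnp.zeros((1,))
x = jax.random.normal(key, (8, 4))
y = jnp.ones((8, 1))

grads = grad_fn((w, b), x, y)  # tuple of gradients for (w, b)
```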