
Comment by davedx

7 days ago

But they’re ASICs so any big architecture changes will be painful for them right?

TPUs are accelerators for the common operations found in neural nets. A big part is simply a massive number of matrix FMA units to process enormous matrix operations, which comprise the bulk of a forward pass through a model. Caching enhancements and a massive growth in memory were necessary to facilitate transformers, but on the hardware side not a huge amount has changed, and the same fundamentals from years ago still power the latest models. The hardware just keeps getting faster, with more memory and more parallel processing units, and later gained more data types to enable hardware-supported quantization.
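To make the "bulk of a forward pass" claim concrete, here's a rough back-of-the-envelope sketch (illustrative dimensions, not tied to any real TPU or model) of the FLOP breakdown in one transformer block's forward pass:

```python
# Hypothetical sketch: FLOP counts in one transformer block's forward pass,
# illustrating why matrix units dominate accelerator designs.
# d = model width, s = sequence length (illustrative values only).
d, s = 4096, 2048

# Matmul FLOPs (2*m*n*k per m x k @ k x n multiply): QKV projections,
# attention scores, attention-weighted values, output projection,
# and a 2-layer MLP with hidden width 4d.
matmul_flops = (
    3 * 2 * s * d * d        # Q, K, V projections
    + 2 * s * s * d          # scores = Q @ K^T
    + 2 * s * s * d          # scores @ V
    + 2 * s * d * d          # output projection
    + 2 * 2 * s * d * 4 * d  # MLP up- and down-projections
)

# Elementwise work (softmax, layernorm, residual adds) scales like s*d or
# s*s, not s*d*d -- a generous upper bound here.
other_flops = 10 * s * d + 5 * s * s

share = matmul_flops / (matmul_flops + other_flops)
print(f"matmul share of forward-pass FLOPs: {share:.2%}")
```

Even with generous accounting for the elementwise ops, matmuls end up well above 99% of the arithmetic, which is why throwing silicon at FMA units pays off.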

So it isn't like Google designed a TPU for a specific model or architecture. They're pretty general purpose in a narrow field (oxymoron, but you get the point).

The set of operations Google designed into a TPU is very similar to what Nvidia did, and it's about as broadly capable. But Google owns the IP, doesn't pay the premium, and gets to design for its own specific needs.

  • There are plenty of matrix multiplies in the backward pass too. Obviously this is less useful when serving but it's useful for training.
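As a minimal sketch of why (not TPU-specific): for a linear layer y = x @ W, both gradients in the backward pass are themselves matrix multiplies, so the same FMA hardware applies.

```python
import numpy as np

# Backward pass of a linear layer y = x @ W: both gradients are matmuls.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))   # batch of input activations
W = rng.standard_normal((16, 4))   # weight matrix
dy = rng.standard_normal((8, 4))   # upstream gradient dL/dy

dW = x.T @ dy    # gradient w.r.t. weights: a matmul
dx = dy @ W.T    # gradient w.r.t. inputs: another matmul
```

So training roughly triples the matmul work per layer (one forward, two backward) but doesn't change its character.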

I'd think not. They have the hardware and software experience, and likely already have next and next-next generation plans in place. The big hurdle is money, which Google has a bunch of.