Comment by ethmarks

4 hours ago

> TPUs are specifically designed to handle the massive computations involved in training LLMs and can speed up training considerably compared to CPUs.

That seems like a low bar. Who's training frontier LLMs on CPUs? Surely they meant to compare TPUs to GPUs. If "this is faster than a CPU for massively parallel AI training" is the best you can say about it, that's not very impressive.

5 comments

ethmarks

babl-yc 3 hours ago

I don't know if you can generally say that "LLM training is faster on TPUs vs GPUs". There is variance among LLM architectures, TPU cluster sizes, GPU cluster sizes...

They are both designed to do massively parallel operations. TPUs are just a bit more specific to matrix multiply+adds while GPUs are more generic.

Workaccount2 4 hours ago

It's a typo

ethmarks 4 hours ago
Does Google's team not proofread this stuff? Or maybe is this an early draft that wasn't meant to be released?
- camdenreslink 4 hours ago
  
  It was generated by an LLM like everything else these days.
- silveraxe93 3 hours ago
  
  This is a leak, yeah.
  Though come on... Even with proofreading, this is an easy one to miss.