Comment by qcnguy

3 months ago

Not vibes. TPUs have fallen behind or had to be redesigned from scratch many times as neural architectures and workloads evolved, whereas the more general purpose GPUs kept on trucking and building on their prior investments. There's a good reason so much research is done on Nvidia clusters and not TPU clusters. TPU has often turned out to be over-specialized and Nvidia are pointing that out.

pests 3 months ago

You say that like it's a bad thing. Nvidia architectures keep changing and getting more advanced as well, with specialized tensor operations, different accumulators and caches, etc. I see no issue with progress.

oivey 3 months ago

That’s missing the point. Things like tensor cores were added in parallel with improvements to existing compute, and CUDA kernels from 10 years ago generally run without modification. Hardware architecture may change, but Nvidia has largely avoided changing how you interact with it.
- saagarjha 3 months ago
  
  Modern CUDA programs that hit roofline look absolutely nothing like those from 10 or even 5 years ago. Or even 2 if you’re on Blackwell.
  
- kllrnohj 3 months ago
  
  And yet current versions of Whisper GPU will not run on my not-quite-10-year-old Pascal GPU anymore because the CUDA version the hardware supports is too old.
  Just because it's still called CUDA doesn't mean it's portable over a not-that-long timeframe.
  

charleshn 3 months ago

> There's a good reason so much research is done on Nvidia clusters and not TPU clusters.

You are aware that Gemini was trained on TPU, and that most research at DeepMind is done on TPU?