Comment by anonym29
15 hours ago
I have never understood why, in these discussions, nobody brings up other specialized silicon providers like Groq, SambaNova, or my personal favorite, Cerebras.
Cerebras CS-3 specs:
• 4 trillion transistors
• 900,000 AI cores
• 125 petaflops of peak AI performance
• 44GB on-chip SRAM
• 5nm TSMC process
• External memory: 1.5TB, 12TB, or 1.2PB
• Trains AI models up to 24 trillion parameters
• Cluster size of up to 2048 CS-3 systems
• Memory B/W of 21 PB/s
• Fabric B/W of 214 Pb/s (~26.75 PB/s)
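For anyone sanity-checking that last line: 214 Pb/s is petabits, so the bytes figure is just a divide-by-8. A trivial check in Python, purely the arithmetic:

    # Fabric bandwidth conversion: petabits/s -> petabytes/s
    fabric_pbits_per_s = 214
    fabric_pbytes_per_s = fabric_pbits_per_s / 8  # 8 bits per byte
    print(f"{fabric_pbytes_per_s:.2f} PB/s")      # -> 26.75 PB/s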
Comparing GPU to TPU is helpful for showcasing the TPU's advantages in the same way that comparing CPU to Radeon GPU is helpful for showcasing the GPU's advantages, but everyone knows the Radeon GPU's real competition isn't the CPU, it's the Nvidia GPU!
TPU vs GPU is new paradigm vs old paradigm. GPUs aren't going away even after they "lose" the AI inference wars, but the winner isn't guaranteed to be the new-paradigm chip from the most famous company.
To my knowledge, Cerebras inference remains the fastest on the market to this day, because keeping weights in massive on-chip SRAM rather than external DRAM removes the memory-bandwidth bottleneck (a rough sketch of why that matters is below), and they remain the only company focused on specialized inference hardware with enough operating revenue to justify the costs from a financial perspective.
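A minimal back-of-envelope sketch, not a benchmark: single-stream decode of a dense model is roughly memory-bandwidth-bound, so tokens/s is capped by bandwidth divided by the bytes of weights streamed per token. The 21 PB/s figure comes from the spec list above; the HBM bandwidth, model size, and precision are illustrative assumptions.

    # Back-of-envelope ceiling for single-stream (batch-1) decode speed,
    # assuming the step is memory-bandwidth-bound: every generated token
    # requires streaming all model weights through the compute units once.
    # All non-spec numbers below are illustrative assumptions.

    def decode_tokens_per_s(bandwidth_bytes_per_s, n_params, bytes_per_param=2.0):
        """Upper bound on dense-model decode throughput for one stream."""
        return bandwidth_bytes_per_s / (n_params * bytes_per_param)

    hbm_bw = 3.35e12   # ~3.35 TB/s, HBM-class accelerator (assumed)
    cs3_bw = 21e15     # 21 PB/s aggregate on-chip SRAM, per the specs above
    params = 70e9      # hypothetical 70B-parameter dense model, fp16/bf16

    print(f"HBM ceiling:  ~{decode_tokens_per_s(hbm_bw, params):,.0f} tokens/s")
    print(f"SRAM ceiling: ~{decode_tokens_per_s(cs3_bw, params):,.0f} tokens/s")

Real systems land well below these ceilings for all sorts of reasons, but the three-orders-of-magnitude gap in the ceiling is why SRAM-first designs can post the latency numbers they do.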
I get how valuable and important Google's OCS (optical circuit switch) interconnects are, not just for TPUs or inference, but as a demonstrated proof of concept for computing in general. Skipping the electrical-optical-electrical (E-O-E) translation is huge, and the entire computing hardware industry would stand to benefit from taking notes here, but that alone doesn't automatically crown Google the victor, does it?