Comment by oivey
16 hours ago
The original post made no comments about inference or training, or even cost, in any way. It said you could hook up more TPUs together, with more memory and higher average bandwidth, than you could with a datacenter of Nvidia GPUs. From an architectural point of view, it isn’t clear (and is not explained) what that enables. It clearly hasn’t led to a business outcome where Google is the clear market leader.
Seemingly, fast interconnects benefit training more than inference, since training involves more parallel communication between nodes. Inference for users is more embarrassingly parallel (it requires less communication) than updating and merging network weights.
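To make that distinction concrete, here's a minimal JAX sketch (mine, not from the original post, and simplified to plain data parallelism) of why training stresses the interconnect while per-request inference mostly doesn't:

```python
import jax
import jax.numpy as jnp
from functools import partial

def loss_fn(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

# Data-parallel training: every step ends with an all-reduce (pmean) of the
# gradients across devices -- that's the cross-node interconnect traffic.
@partial(jax.pmap, axis_name="devices")
def train_step(w, x, y):
    grads = jax.grad(loss_fn)(w, x, y)
    grads = jax.lax.pmean(grads, axis_name="devices")  # merge across all devices
    return w - 0.01 * grads

# Inference: each device just serves its own shard of requests independently,
# with no collective communication between devices.
@jax.pmap
def infer(w, x):
    return x @ w
```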
My point: cool benchmark, but what does it matter? The original post says Nvidia doesn’t have anything to compete with massively interconnected TPUs. It didn’t merely say Google’s TPUs were better; it said Nvidia can’t compete. That’s clearly bullshit and wishful thinking, right? There is no evidence in the market to support it, and no actual technical points have been presented in this thread either. OpenAI, Anthropic, etc. are certainly competing with Google, right?
The fact that Nvidia is currently winning is undisputed.