Comment by spmurrayzzz

5 hours ago

Near-term acquihires are a likely bet, I think. But given model progress on related benchmarks like KernelBench [1], I do think a set of more commoditized solutions is also inevitable.

The caveat, though, is that each new generation of hardware often comes with brand-new constraints/features that a given generation of models hasn't seen before (e.g. tcgen05 in Blackwell was OOD at one point). As the models start to generalize better, this might not be a showstopper, but it's still an issue, at least for now.

[1] https://kernelbench.com/

Thank you for sharing the link. It's fascinating that the models can only do 10-20% on the hard subset, and I wonder why that is. The fact that they can only get 30-40% out of the fp8 GEMM seems unintuitive to me; I would've expected convergence near 80%.