Comment by famouswaffles
7 hours ago
Almost certainly not if things remain as they are. The reason there's been little traction is that the quality gap between diffusion and autoregressive models is pretty stark. I mean, just look at the benchmarks here: large drop-offs, with the hardest benchmarks seeing the largest drops. On top of that, almost all the speed benefits of diffusion models are negated at scale. So this is only attractive for local model development, and almost everyone training local models still cares about pound-for-pound quality and inference efficiency at scale.
It's fast enough that "ask it twice and pick the best" should still come out ahead performance-wise. I don't know how much of the quality gap that would close, but it's worth a play.
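For a rough feel of what "ask it twice and pick the best" could buy, here is a back-of-the-envelope sketch. It assumes the samples are independent and that whatever does the picking always finds a correct answer when one exists; both assumptions are optimistic, so treat the result as an upper bound. The function name and the 0.5 success rate are made up for illustration and are not taken from the benchmarks.

```python
def best_of_n_success(p: float, n: int) -> float:
    """Upper bound on best-of-n accuracy.

    Assumes each of the n samples is independently correct with
    probability p, and that the selector is perfect (it picks a
    correct sample whenever at least one exists).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    # Probability that all n samples are wrong is (1 - p) ** n.
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-sample success rate of 0.5, asked twice.
print(best_of_n_success(0.5, 2))  # 0.75
```

In practice the samples are correlated and the selector makes mistakes, so the real gain would be smaller than this, and the second sample also doubles the compute per query.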