Comment by peter_d_sherman

2 months ago

>"Traditional autoregressive language models generate text one word – or token – at a time. This sequential process can be slow, and limit the quality and coherence of the output.

Diffusion models work differently. Instead of predicting text directly, they learn to generate outputs by refining noise, step-by-step. This means they can iterate on a solution very quickly and error correct during the generation process."

It would seem that diffusion (iterative noise filtering) would be more parallelizable than traditional autoregressive large language models: within a single denoising step, every token position can be updated at once, whereas autoregressive decoding has to emit tokens strictly one after another... a toy sketch of that difference below.
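Rough sketch of what I mean (Python; `fake_logits` is a made-up stand-in for a real trained network, and this only illustrates the dependency structure, not how Gemini Diffusion or any real diffusion LM actually works):

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "<mask>"]

def fake_logits(key):
    # Made-up stand-in for a model forward pass; deterministic in its input
    # within a run, so the example executes as-is.
    rng = random.Random(hash(key))
    return [rng.random() for _ in VOCAB]

def autoregressive_decode(length):
    # Each token is predicted from all previously generated tokens, so this
    # loop is inherently sequential: `length` dependent model calls.
    tokens = []
    for _ in range(length):
        logits = fake_logits(tuple(tokens))
        tokens.append(VOCAB[logits.index(max(logits))])
    return tokens

def diffusion_denoise_step(noisy):
    # One refinement step: every position is re-predicted from the SAME
    # frozen input sequence. The positions don't depend on each other, so
    # this loop could run as one batched, parallel forward pass.
    refined = []
    for i in range(len(noisy)):
        logits = fake_logits((tuple(noisy), i))
        refined.append(VOCAB[logits.index(max(logits))])
    return refined

if __name__ == "__main__":
    print("autoregressive:", autoregressive_decode(5))

    seq = ["<mask>"] * 5          # start from pure "noise" (all masks)
    for _ in range(3):            # a handful of refinement steps
        seq = diffusion_denoise_step(seq)
    print("diffusion:     ", seq)
```

The autoregressive loop needs N dependent model calls for N tokens, while the diffusion-style loop needs K refinement passes that are each internally parallel across positions... which is presumably where the speed claims come from.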

The parallelization of such algorithms might be an interesting area of study...