Comment by bjt12345

7 days ago

I do wonder why diffusion models aren't used alongside constrained decoding for programming. Surely that makes more sense than using an auto-regressive model.

2 comments


bob1029 7 days ago

Diffusion models have to infer the causal structure of language from within a symmetric architecture, where information can flow both forward and backward. AR forces information to flow in a single direction and is substantially easier to control as a result. The second sentence in a paragraph of English text often cannot come before the first, or the statement wouldn't make sense. Sometimes this is not an issue (and I think these are the cases where parallel generation makes sense), but the edge cases are where all the money lives.
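To make the "easier to control" point concrete, here is a minimal toy sketch of constrained decoding on a left-to-right model. Everything in it is an assumption for illustration: the three-token vocabulary, the `toy_logits` stand-in (a fake model that always prefers "("), and the balanced-parentheses "grammar". The point is only that the set of legal next tokens depends on a fixed prefix, which AR decoding always has available. A diffusion sampler that fills positions in arbitrary order would need a validity check over partial strings with holes, which this sketch does not attempt.

```python
import math

VOCAB = ["(", ")", "<eos>"]  # hypothetical toy vocabulary


def toy_logits(prefix):
    """Stand-in for a model: ignores the prefix and always prefers '('."""
    return {"(": 2.0, ")": 1.0, "<eos>": 0.0}


def allowed_tokens(prefix, max_len):
    """Tokens that keep the prefix completable to a balanced string of at most max_len."""
    depth = prefix.count("(") - prefix.count(")")
    remaining = max_len - len(prefix)
    allowed = set()
    if depth == 0:
        allowed.add("<eos>")          # balanced so far, stopping is legal
    if depth > 0:
        allowed.add(")")              # there is an open paren to close
    if remaining - 1 >= depth + 1:
        allowed.add("(")              # still room to close everything afterwards
    return allowed


def decode(max_len=6, constrained=True):
    """Greedy left-to-right decoding, optionally masking illegal tokens."""
    prefix = []
    while len(prefix) < max_len:
        logits = toy_logits(prefix)
        if constrained:
            ok = allowed_tokens(prefix, max_len)
            # Mask: illegal tokens get -inf so argmax can never pick them.
            logits = {t: (s if t in ok else -math.inf) for t, s in logits.items()}
        token = max(logits, key=logits.get)
        if token == "<eos>":
            break
        prefix.append(token)
    return "".join(prefix)


if __name__ == "__main__":
    print(decode(constrained=False))  # the toy model alone never closes a paren
    print(decode(constrained=True))   # the mask forces a balanced result
```

With the mask on, the same greedy loop that would otherwise emit only opening parentheses is forced to close them in time, and the check never has to look at anything except the tokens already emitted.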

bjt12345 5 days ago

But I do wonder whether diffusion models will be used for more complex software architecture work, given their long-term coherence and lack of exposure bias. Their symmetric architecture could also work well with interaction nets.