Comment by xnx

3 days ago

Is this newer/better than the speculative decoding from 2022? https://arxiv.org/abs/2211.17192

2 comments

xnx

Seems like they focus on improving the drafter and the verification policy, so that speculation keeps producing net speedups instead of wasted verification work at DeepSeek scale.
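For intuition on "net speedup vs. wasted verification work", here is a rough sketch of the simple cost model from the 2022 paper linked above, not anything taken from the new paper. It assumes each drafted token is accepted independently with probability alpha, that the drafter proposes gamma tokens per round, and that one drafter step costs cost_ratio times one target-model pass. The function names and the example numbers are my own, purely illustrative.

```python
def expected_tokens_per_pass(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumes (hypothetically) an i.i.d. per-token acceptance rate `alpha`
    and `gamma` drafted tokens per round.
    """
    if alpha >= 1.0:
        # Every draft accepted, plus the one token the target model adds.
        return float(gamma + 1)
    # Closed form of the geometric sum 1 + alpha + ... + alpha**gamma.
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def expected_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Expected wall-clock speedup over plain autoregressive decoding.

    One round costs `gamma` drafter steps (each `cost_ratio` of a target
    pass) plus one target pass that verifies all drafts in parallel.
    """
    round_cost = gamma * cost_ratio + 1.0
    return expected_tokens_per_pass(alpha, gamma) / round_cost


if __name__ == "__main__":
    # A cheap, accurate drafter versus a costly, inaccurate one.
    print(expected_speedup(alpha=0.8, gamma=4, cost_ratio=0.05))
    print(expected_speedup(alpha=0.2, gamma=4, cost_ratio=0.3))
```

Under these assumptions the first case comes out above 1 (a real speedup) and the second below 1 (slower than not speculating at all), which is the failure mode a better drafter and verification policy would be trying to avoid.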

alok-g 3 days ago

That paper is cited in the 'Introduction' and 'Background' sections. This paper improves on it by removing some bottlenecks.