Comment by xnx

3 days ago

Is this newer/better than the speculative decoding from 2022? https://arxiv.org/abs/2211.17192

Seems like they focus on improving the drafter and the verification policy, so that speculation keeps producing net speedups instead of wasted verification work at DeepSeek scale.
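To make "net speedup vs. wasted verification work" concrete, here is a rough back-of-the-envelope sketch using the simplified cost model from the 2022 paper linked above, not anything specific to the new one. The function names are mine, and it assumes a fixed per-token acceptance rate `alpha`, `gamma` drafted tokens per step, and a drafter that costs `cost_ratio` of one target-model pass.

```python
def expected_tokens(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model pass.

    alpha: assumed i.i.d. probability that a drafted token is accepted.
    gamma: number of tokens the drafter proposes per step.
    """
    if alpha >= 1.0:
        # Every draft is accepted, plus one extra token from the target model.
        return float(gamma + 1)
    return (1 - alpha ** (gamma + 1)) / (1 - alpha)


def walltime_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Estimated speedup over plain autoregressive decoding.

    cost_ratio: time of one drafter pass divided by one target pass
    (hypothetical number, you would have to measure it).
    """
    # One step costs gamma drafter passes plus one target verification pass.
    step_cost = gamma * cost_ratio + 1
    return expected_tokens(alpha, gamma) / step_cost


if __name__ == "__main__":
    for alpha in (0.3, 0.6, 0.9):
        s = walltime_speedup(alpha, gamma=4, cost_ratio=0.2)
        verdict = "net speedup" if s > 1 else "wasted work"
        print(f"alpha={alpha:.1f}  speedup={s:.2f}x  ({verdict})")
```

Under these assumptions a poorly matched drafter (low `alpha`) drops below 1x, which is why a better drafter and verification policy matter.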

That paper is cited in the introduction and background sections. This one builds on it by removing some bottlenecks.