Comment by mananaysiempre
8 months ago
> This kind of "expected next field" optimization has a long history in protobuf
You could probably even trace the history of the idea all the way back to Van Jacobson’s 30-instruction TCP fastpath[1]. Or, closer to home: I’ve found that an interpreter for a stack+accumulator VM (which, compared to a pure stack machine, tends to inflate the bytecode count, and thus the dispatch cost, with constant PUSH-accumulator instructions) runs significantly faster if you change the (non-shared) dispatch at the end of each handler from
    return impl[*pc](pc, ...);
to
    if (*pc == PUSH) {
        do_push(...); pc++;
    }
    return impl[*pc](pc, ...);
which feels somewhat analogous to the next-field optimization and avoids polluting the indirect branch predictor with the very common PUSH predictions. (It’s still slower than not having those PUSHes in the first place.)
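For concreteness, here is a minimal, self-contained sketch of that second dispatch shape in C. It is an illustration under assumptions, not the interpreter described above: the four opcodes, the handler signature, and the names handler_fn, dispatch and run_program are all invented for the example. It also relies on the compiler turning the handler calls into tail calls (typical at -O2, but not guaranteed by the C standard); without that, each instruction costs a native stack frame.

```c
#include <stdint.h>

/* Hypothetical opcodes for a tiny stack+accumulator VM. */
enum { OP_PUSH, OP_LOADI, OP_ADD, OP_HALT };

/* Every handler receives the VM state and returns the final accumulator. */
typedef long handler_fn(const uint8_t *pc, long acc, long *sp);

static handler_fn op_push, op_loadi, op_add, op_halt;

static handler_fn *const impl[] = {
    [OP_PUSH]  = op_push,
    [OP_LOADI] = op_loadi,
    [OP_ADD]   = op_add,
    [OP_HALT]  = op_halt,
};

/* Non-shared dispatch: inlined at the end of every handler.
 * A single PUSH is peeled off with an ordinary conditional branch,
 * so it never reaches the indirect call below. */
static inline long dispatch(const uint8_t *pc, long acc, long *sp)
{
    if (*pc == OP_PUSH) {
        *sp++ = acc;            /* same effect as op_push */
        pc++;
    }
    return impl[*pc](pc, acc, sp);
}

/* Still needed: the check above peels only one PUSH, so a second
 * consecutive PUSH goes through the table as usual. */
static long op_push(const uint8_t *pc, long acc, long *sp)
{
    *sp++ = acc;
    return dispatch(pc + 1, acc, sp);
}

/* LOADI imm8: load a one-byte immediate into the accumulator. */
static long op_loadi(const uint8_t *pc, long acc, long *sp)
{
    (void)acc;
    return dispatch(pc + 2, (long)pc[1], sp);
}

/* ADD: accumulator += popped top of stack. */
static long op_add(const uint8_t *pc, long acc, long *sp)
{
    long top = *--sp;
    return dispatch(pc + 1, top + acc, sp);
}

static long op_halt(const uint8_t *pc, long acc, long *sp)
{
    (void)pc; (void)sp;
    return acc;
}

/* Run a program that must end in OP_HALT; no bounds checks in this sketch. */
long run_program(const uint8_t *code)
{
    long stack[64];
    return dispatch(code, 0, stack);
}
```

Calling run_program on the byte sequence LOADI 2, PUSH, LOADI 3, ADD, HALT executes the PUSH inside the conditional in dispatch rather than through impl[], which is the whole point: the indirect call only ever sees the less predictable opcodes.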
[1] https://www.pdl.cmu.edu/mailinglists/ips/mail/msg00133.html