Comment by hinkley
1 year ago
Instead of a for loop where you hit a switch at the top of the loop, you jump to the code block for next instruction from the end of the previous instruction. That stops you from jumping to the top of the loop and then jumping a second time.
On older CPUs, you’re less likely to hit a pipeline stall. This technique was called “super-pipelining” for that reason. But a few years ago when this was discussed, someone pointed out that’s usually not necessary anymore. That branch predictors can see through double jumps now.
But as I alluded to in another reply, CPU caches are finite, and I have doubts whether in a fully realized interpreter, particularly one living side by side with a JIT, if that microbenchmark is lying to you about how fast the inner loop is under production conditions.
No comments yet
Contribute on Hacker News ↗