Comment by mikemike
9 years ago
Actually, LuaJIT 1.x is just that: a translator from a register-based bytecode to machine code using templates (small assembler snippets) with fixed register assignment. There's only a little bit more magic to it, like template variants depending on the inferred type etc.
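To illustrate the general idea of template-based translation (not LuaJIT 1.x's actual code), here is a minimal sketch. The opcode names, the `(op, a, b, c)` instruction layout, the 4-byte slot size, and the x86-style assembly text are all assumptions made up for this example. It emits assembly as strings rather than real machine code, and it does not model the type-dependent template variants mentioned above.

```python
# Hypothetical register-based bytecode: (opcode, a, b, c), where a/b/c are
# virtual-register (stack slot) indices, except LOADK where b is a constant.
SLOT_SIZE = 4  # assumed size of one VM stack slot in bytes

# One fixed assembler snippet per opcode. Register assignment is fixed:
# ebx always holds the base of the VM stack, eax is always the scratch register.
TEMPLATES = {
    "LOADK": ["mov dword [ebx+{a}], {imm}"],
    "MOVE": ["mov eax, [ebx+{b}]", "mov [ebx+{a}], eax"],
    "ADD": ["mov eax, [ebx+{b}]", "add eax, [ebx+{c}]", "mov [ebx+{a}], eax"],
}


def translate(bytecode):
    """Expand each bytecode instruction into its template, filling in operands."""
    asm = []
    for op, a, b, c in bytecode:
        fields = {
            "a": a * SLOT_SIZE,  # slot index -> byte offset from ebx
            "b": b * SLOT_SIZE,
            "c": c * SLOT_SIZE,
            "imm": b,  # only used by templates that take an immediate
        }
        asm.extend(line.format(**fields) for line in TEMPLATES[op])
    return asm


program = [("LOADK", 0, 7, 0), ("LOADK", 1, 5, 0), ("ADD", 2, 0, 1)]
for line in translate(program):
    print(line)
```

Because every template loads its operands from the stack and stores its result back, no register allocation is needed across instructions. That keeps the translator trivial, at the cost of the redundant loads and stores an optimizing compiler would remove.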
You can compare the performance of LuaJIT 1.x and 2.0 yourself on the benchmark page (for x86). The LuaJIT 1.x JIT-compiled code is only slightly faster than the heavily tuned LuaJIT 2.x VM plus the 2.x interpreter written in assembly language by hand. Sometimes the 2.x interpreter even beats the 1.x compiler.
A lot of this is due to the better design of the 2.x VM (object layout, stack layout, calling conventions, builtins etc.). But from the perspective of the CPU, a heavily optimized interpreter does not look that different from simplistic, template-generated code. The CPU can move the interpreter dispatch overhead into independent dependency chains, if you do this right.
Of course, the LuaJIT 2.x JIT compiler handily beats both the 2.x interpreter and the 1.x compiler.
HN is an astonishing thing!
Article: "We can also refute Bernstein’s argument from first principles: the kind of people who can effectively hand-optimize code are expensive and not incredibly plentiful."
Commenter: "IMO he couldn't give a convincing answer to the guy who asked about LuaJIT author being out of a job."
Guy in audience: "I was that guy in the audience."
LuaJIT author: "Actually, LuaJIT 1.x is just that"
Voice in my head: "Aspen 20, I show you at one thousand eight hundred and forty-two knots, across the ground."
Meta: Apologies for the abstract response, but I couldn't figure out a better way to present the parallel. It can be hard to explain artistic allusions without ruining them. What I mean to say is that this pattern of responses reminded me in a delightful way of the classic story of the SR-71 ground speed check: http://www.econrates.com/reality/schul.html
I'm impressed (as usual with your work) that you're able to get that level of performance from an interpreter, although as you note it's not an apples-to-apples comparison. I wonder what you'd get from the 2.x VM design with a templating JIT compiler.
But, as you said, my main point is your last sentence -- optimizing compilers like LuaJIT 2.x are impressive and necessary.