Comment by DannyBee
10 years ago
Let's take it piece by piece
" * Diamond-shaped control-flow is known to be the worst-case scenario for most optimizations and for register allocation. Nested diamond-shaped control-flow is even worse. "
Without more info, i can't possibly agree or disagree with this.
Diamonds are definitely not the worst case for register allocation, though.
" * The compiler doesn't have enough hints to see what the fast paths and what the slow paths are. Even if you'd be able to tell it, it still sees this as a single giant control-flow graph."
With profiling info, dynamic or static, this is wrong.
" Anything in this loop could potentially influence anything else, so almost nothing can be hoisted or eliminated. The slow paths kill the opportunities for the fast paths and the complex instructions kill the opportunities for the simpler instructions."
This is not accurate. First, it assumes no optimization or alias analysis is path-sensitive, which is false.
Second, even in those cases where it is true, a smart compiler will simply insert runtime aliasing tests for what it can't prove. They do this already. A JIT (again, another type of compiler, just not static) would not even need this.
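To make the "runtime aliasing tests" point concrete, here is a hand-written C sketch of the kind of check a compiler can emit when it can't prove two pointers don't overlap (often called loop versioning). The kernel and the function names are hypothetical, picked only for illustration; a real compiler does this internally and the generated code will look different.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: dst[i] += src[i] * k.
 * Version the optimizer is free to vectorize and reorder, because
 * `restrict` promises that dst and src do not overlap. */
static void scale_add_noalias(float *restrict dst, const float *restrict src,
                              float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i] * k;
}

/* Conservative version: every store to dst may change a later src load. */
static void scale_add_generic(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i] * k;
}

/* Runtime aliasing test: one cheap range check picks the version.
 * Comparing addresses through uintptr_t is implementation-defined but
 * works on mainstream flat-memory targets; address overflow is ignored
 * in this sketch. */
void scale_add(float *dst, const float *src, float k, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    size_t bytes = n * sizeof(float);

    if (d + bytes <= s || s + bytes <= d)
        scale_add_noalias(dst, src, k, n);  /* ranges are disjoint */
    else
        scale_add_generic(dst, src, k, n);  /* ranges overlap */
}
```

The check runs once per call, so its cost is amortized over the whole loop.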
Let's skip his hand coding and get to what he says the advantages are:
"* Keep a fixed register assignment for all instructions.
* Keep everything in registers for the fast paths. Spill/reload only in the slow paths.
* Move the slow paths elsewhere, to help with I-Cache density."
All of these things are done by optimizing compilers, already, given profiling info.
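For readers who haven't seen it, here is a minimal C sketch of how fast and slow paths can be marked by hand with GCC/Clang extensions (`__builtin_expect`, `__attribute__((cold))`, `__builtin_add_overflow`). With FDO (`-fprofile-generate` / `-fprofile-use` in GCC) the compiler gets the same information from measured branch counts instead of annotations. The `Value` type and the function names are made up for the example and are not taken from any real interpreter.

```c
#include <stdint.h>

#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical tagged value: either a 64-bit integer or a double. */
typedef struct {
    int is_int;
    union { int64_t i; double d; } v;
} Value;

Value make_int(int64_t i) { Value r; r.is_int = 1; r.v.i = i; return r; }
Value make_num(double d)  { Value r; r.is_int = 0; r.v.d = d; return r; }

/* Slow path: mixed types or integer overflow. `cold` asks the compiler
 * to optimize it for size and place it away from hot code, and
 * `noinline` keeps it out of the caller's instruction stream. */
__attribute__((cold, noinline))
static Value add_slow(Value a, Value b)
{
    double x = a.is_int ? (double)a.v.i : a.v.d;
    double y = b.is_int ? (double)b.v.i : b.v.d;
    return make_num(x + y);
}

/* Fast path: both operands are integers and the sum does not overflow. */
Value value_add(Value a, Value b)
{
    if (LIKELY(a.is_int && b.is_int)) {
        int64_t sum;
        if (LIKELY(!__builtin_add_overflow(a.v.i, b.v.i, &sum)))
            return make_int(sum);
    }
    return add_slow(a, b);
}
```

Whether the result matches hand-written assembly is exactly what is being argued here; the sketch only shows that the fast/slow split can be expressed to the compiler.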
If his statement about what he thinks the advantages are is accurate, and what we do, i'm 100% positive that GCC + FDO either already does, or could be tuned without a ton of work, to beat Mike Pall for this kind of loop.
> If his statement about what he thinks the advantages are is accurate , and what we do, i'm 100% positive that GCC + FDO either already does, or could be tuned without a ton of work, to beat Mike Pall for this kind of loop.
I love the gauntlet-throwing fire in this statement, and I only wish that it was practical to arrange for a showdown on this to actually happen. Especially since it would do nothing but improve the state of the art on both sides, and be a great case study in the capabilities and limitations of compilers.
Unfortunately, I think an apples-to-apples showdown is impractical because the two interpreters use different byte-code. Writing a pure-C interpreter for LuaJIT would be non-trivial.
> I love the gauntlet-throwing fire in this statement, and I only wish that it was practical to arrange for a showdown on this to actually happen.
Yes, but it's already available in a simpler case, as alluded to above and in discussion on the preview of this talk. There wasn't an answer to the question about which compiler is competitive with OpenBLAS on the linpack benchmark in Fortran (dominated by the triply-nested matrix multiplication loop).
A typical computational chemistry program of the type that consumes many cycles on HPC systems might use three types of kernel in assembler for good reason, e.g. OpenBLAS, FFTW, and ELPA (scalable eigenvalues). This is unfortunate, but at least a few people doing the work can have a large benefit; many thanks to them.
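For reference, the triply-nested matrix multiplication loop mentioned above looks roughly like this when written naively. The sketch is in C rather than Fortran, and the function name and row-major layout are assumptions for illustration. This is the form a compiler would have to block, vectorize and schedule on its own to get near a hand-tuned kernel.

```c
#include <stddef.h>

/* Naive C = A * B for n x n matrices stored in row-major order.
 * Hypothetical helper; no blocking, no explicit vectorization. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```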
I would love to see the results. I suggest contacting him either directly or through the mailing list for a representative sample of a C interpreter that can be benchmarked against luajit. http://luajit.org/contact.html
To be frank, if it wasn't interesting enough for him to ever file a bug against a compiler for, and he's already got something he likes, i don't see why i should take the time to produce results? I mean, what would the point be? He's obviously happy with what he's got (and more power to him).
Even if i did it, outside of supporting an argument on the internet, what changes?
Today people rely on sources like Mike's mailing list post to decide whether their project can be done in C or not. If GCC has significantly improved since 2011, it's important that people are able to show evidence of this.