Comment by dougall
2 hours ago
Hi, author here. My version definitely shouldn't be faster unless something very weird is going on with the runtime (though I think with the benefit of hindsight some further optimisation of it is possible). I have never seen a good use for this, aside from as a proof that it is possible, but I can imagine it coming up if, say, you wanted to write an exploit for an esoteric programming language runtime.
If you still maintain this code and want to optimise it, I don't think you should need a full powers-of-two table, just having log(n) powers of two should do in a pattern like:
if (v >= 2**512) { v *= 2**-512; e += 512; }
if (v >= 2**256) { v *= 2**-256; e += 256; }
...
That's a straightforward memory saving, and it also leaves v normalised, so you get your fraction bits with a single multiplication or division. It's a little less simple than I'm making it look: in practice you either end up moving v close to the subnormal range, or you need a different code path for v < 1 versus v >= 2, or something like that. But otherwise, yeah, the code looks good.
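To make the pattern concrete, here is a rough sketch, assuming JavaScript numbers (IEEE 754 doubles) and a positive finite input. The names `STEPS` and `splitExponent` are made up for illustration and are not taken from your code. The ladder starts at 2**512 because 2**1024 is not a finite double, small values get their own path, and subnormals are pre-scaled by 2**64 (an arbitrary choice that is large enough to reach the normal range).

```javascript
// Sketch only: names are illustrative, input must be a positive finite number.

// The ten scale factors 2**512, 2**256, ..., 2**1 and their reciprocals,
// built by repeated squaring so every entry is an exact power of two.
const STEPS = [];
for (let k = 1, p = 2; k <= 512; k *= 2, p *= p) {
  STEPS.unshift({ k, up: p, down: 1 / p });
}

function splitExponent(v) {
  let e = 0;
  if (v < 1) {
    // Small inputs take their own path. Subnormals are first lifted into
    // the normal range so the ladder below can finish the job.
    if (v < 2 ** -1022) { v *= 2 ** 64; e -= 64; }
    for (const { k, up, down } of STEPS) {
      if (v < down) { v *= up; e -= k; }
    }
    // The ladder leaves 0.5 <= v < 1, so one more doubling normalises it.
    v *= 2; e -= 1;
  } else {
    // Large inputs: start at 2**512, since 2**1024 is not a finite double.
    for (const { k, up, down } of STEPS) {
      if (v >= up) { v *= down; e += k; }
    }
  }
  // v is now in [1, 2); the 52 fraction bits fall out of one multiplication.
  return { m: v, e, fraction: (v - 1) * 2 ** 52 };
}

console.log(splitExponent(6)); // 6 = 1.5 * 2**2
```

Every multiplication here is by a power of two and never underflows, so the scaling is exact, and the ten-entry table replaces a full table of roughly two thousand powers.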
Thanks for the feedback, and congrats on your achievement.
We do still maintain this code, although it is deprecated now.
Even with the unrolled tests, we would still keep the table for the decoding operation, I believe. But it's true that the unrolled tests would also produce the normalized value along the way, which could be beneficial.