← Back to context

Comment by milesrout

2 days ago

I think the hardest part is function calls. Implementing the standard calling convention is really fiddly and annoying when you have to save and restore registers around function calls if you want to do it efficiently (keeping track of which registers are dirty, doing parallel loads of registers into the right argument registers, spilling when appropriate, etc. It is fiddly.

The alternative is going to an IR and doing full register allocation but then you need to implement Lengauer-Tarjan to get into SSA, all the same parallel load stuff for phi functions/block arguments, out-of-SSA, reconstruct in optimisation passes all the information you discard by going to a linear IR, etc.

I'd argue that Lengauer-Tarjan is overrated. While it is neat, it's premature optimization IMHO. Fully maximal SSA (at the entry of each block insert a phi-node for each variable) + dead code elimination (we need it anyways) is much simpler.

  • Typical implementations of Lengauer-Tarjan are often taken verbatim from Andrew Appel's book and involve higher constant factors than alternative algorithms - such as Cooper et al's "engineered" version of the usual fixpoint algorithm, which is very neat, simple, and performs well.

    If you are saying that constructing SSA following the classical approach is a premature optimisation, perhaps you would prefer Braun et al's retelling of Cliff Click's SSA construction algorithm - which works backwards from uses to place phis and requires no dominance information to be computed.

    • Thank you for the reference, I'll check it.

      Why not fully maximal SSA + DCE? I mean, other than timings consideration. Fully maximal does not even need any graph traversal, it is entirely local to each block.

      4 replies →