← Back to context

Comment by inkyoto

10 months ago

Wasting a register on comparatively more modern ISA's (PA-RISC 2.0, MIPS64, POWER, aarch64 etc – they are all more modern and have an abundance of general purpose registers) is not a concern.

The actual «wastage» is in having to generate a prologue and an epilogue for each function – 2x instructions to preserve the old frame pointer and set a new one up, and 2x instruction at the point of return – to restore the previous frame pointer.

Generally, it is not a big deal with an exception of a pathological case of a very large number of very small functions calling each other frequently where the extra 4x instructions per each such a function will be filling up the L1 instruction cache «unnessarily».

Those pathological cases are really what inlining is for, with the exception of any tiny recursive functions that can't be tail call optimised.

  • Yes, inlining (and LTO can take it a notch or two higher) does away with the problem altogether, however the number of projects that default to «-Os» (or even to «-O2») to build a release product is substantial to large.

    There is also a significant number of projects that go to great lengths to force-override CFLAGS/CXXFLAGS (usually with «-O2 -g» or even with «-O») or make it extraordinary difficult to change the project's default CFLAGS, for no apparent reasons which eliminates a number of advanced optimisations in builds with default build settings.