
Comment by quotemstr

2 years ago

The C calling convention kind of sucks. True, can't change the C calling convention, but that doesn't make it any less unfortunate.

We should use every available caller-saved register for arguments and return values, but in the traditional SysV ABI, we use only one register (sometimes two) for return values. If you return a struct Point3D { long x, y, z }, you spill to the stack even though we could damned well put Point3D in rax, rdi, and rsi.
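A minimal C sketch of the case described above, assuming the SysV x86-64 ABI with an 8-byte long. The names make_point3d, make_point3d_out and check_equivalence are hypothetical, made up for illustration. At 24 bytes, struct Point3D is over the 16-byte limit for register returns, so the by-value version is compiled roughly as if it were the out-parameter version: the caller reserves a buffer and passes its address as a hidden first argument.

```c
#include <assert.h>

/* 24 bytes where long is 8 bytes: too big for the rax:rdx return pair
   under the SysV x86-64 ABI (the assumed target). */
struct Point3D { long x, y, z; };

/* Hypothetical helper: returns the struct by value. Under the assumed ABI
   the caller passes a hidden pointer to a stack buffer for the result. */
struct Point3D make_point3d(long x, long y, long z) {
    struct Point3D p = { x, y, z };
    return p;
}

/* Hypothetical helper: roughly what the by-value version lowers to,
   with the hidden result pointer written out explicitly. */
void make_point3d_out(struct Point3D *out, long x, long y, long z) {
    out->x = x;
    out->y = y;
    out->z = z;
}

/* Both forms produce the same value; only the calling sequence differs. */
void check_equivalence(void) {
    struct Point3D a = make_point3d(1, 2, 3);
    struct Point3D b;
    make_point3d_out(&b, 1, 2, 3);
    assert(a.x == b.x && a.y == b.y && a.z == b.z);
}
```

Compiling this with optimisation and reading the assembly for make_point3d should show stores through the hidden pointer rather than three result registers.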

There are other tricks other systems use. For example, if I recall correctly, in SBCL, functions set the carry flag on exit if they're returning multiple values. Wouldn't it be nice if we used the carry flag to indicate, e.g., whether a Result contains an error?

"sucks" is a strong word, but with respect to return values you're right. The C calling conventions, everywhere really, support what C supports: returning one value. Well, not even that (struct returns ... nope). Kind of a "who'd have thought" in C, I guess. And then there's the C++ argument: "just make it inline then".

On the other hand, memory spills happen. On SPARC, for example, the generous register space (register windows) ended up with lots of unused regs in simple functions and a huge, cache-busting stack footprint, definitely so if you ever spilled the register ring. Even with all the mov on x86 (and there is always lots of it, at least in compiled C code) to rearrange data to "where it needed to be", x86 often ended up faster.

When you only look at the callee code (the code generated for a given function signature), it's tempting to say "oh, it'll definitely be fastest if this arg is here and that return there". You don't know the callers, though. There's no guarantee that the argument marshalling will end up "pass-through" or that the returns are consumed "hot". Say, a struct Point { x: i32, y: i32, z: i32 } as arg/return; if the caller does something like mystruct.deepinside.point[i] = func(mystruct.deepinside.point[i]) in a loop, then moving it in and out of regs may be overhead or even prevent vectorisation. But the callee cannot know. Unless... the compiler can see both and inline (back to the C++ excuse). Yes, for JavaScript/Rust-style function call chaining it might be nice/useful "in principle". But in practice only if the compiler has enough caller/callee insight to keep the hot working set "pass-through" (no spills).
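The loop pattern above can be sketched in C as follows. This is only an illustration: func_by_value, func_in_place, the update_* wrappers and points_equal are hypothetical names, and the "+1" body is a stand-in for whatever func really does. Assuming the SysV x86-64 ABI, the 12-byte Point is packed into two integer registers on the way in and again on the way out, so the by-value loop loads from memory, packs, calls, unpacks and stores on every iteration unless the compiler can inline the callee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C rendering of the Point struct from the comment above (12 bytes). */
struct Point { int32_t x, y, z; };

/* Hypothetical callee, by value: under the assumed SysV x86-64 ABI the
   argument and the result each travel packed in two integer registers. */
struct Point func_by_value(struct Point p) {
    p.x += 1;
    p.y += 1;
    p.z += 1;
    return p;
}

/* Hypothetical callee, in place: only a pointer crosses the call boundary. */
void func_in_place(struct Point *p) {
    p->x += 1;
    p->y += 1;
    p->z += 1;
}

/* Caller pattern from the comment: point[i] = func(point[i]) in a loop. */
void update_by_value(struct Point *pts, size_t n) {
    assert(pts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        pts[i] = func_by_value(pts[i]);
}

/* Same result, but the data never leaves memory for the call itself. */
void update_in_place(struct Point *pts, size_t n) {
    assert(pts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        func_in_place(&pts[i]);
}

/* Field-wise comparison, used to check that both loops agree. */
int points_equal(const struct Point *a, const struct Point *b) {
    return a->x == b->x && a->y == b->y && a->z == b->z;
}
```

Which of the two loops is faster depends on the caller and on whether the callee gets inlined, which is exactly the point: the callee's signature alone does not settle it.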

The lowest-hanging fruit on calling is probably to remove the "functions return one primitive thing" assumption that's ingrained in the C ABIs almost everywhere. For the rest? A lot of benchmarking and code generation statistics. I'd love to see more of that. Even if it's dry stuff.

  • > Well, not even that (struct returns ... nope).

    C compilers actually pack small struct return values into registers:

    https://godbolt.org/z/51q5se86s

    The only limitation is that on x86-64, GCC and Clang use up to two registers while MSVC only uses one.

    Also, IMHO there is no such thing as a "C calling convention", there are many different calling conventions that are defined by the various runtime environments (usually the combination of CPU architecture and operating system). C compilers just must adhere to those CPU+OS calling conventions like any other language that wants to interact directly with the operating system.

    IMHO the whole performance angle is a bit overblown though, for 'high frequency functions' the compiler should inline the function body anyway. And for situations where that's not possible (e.g. calling into DLLs), the DLL should expose an API that doesn't require such 'high frequency functions' in the first place.
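
    A small C sketch of the register packing described earlier in this comment. The struct and function names are hypothetical, and the comments state what the SysV x86-64 and Windows x64 conventions do as I understand them; the compiler output (as in the godbolt link) is the authority.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* 8 bytes: fits a single return register (rax) under both the
       SysV x86-64 and the Windows x64 conventions. */
    struct Small { int32_t a, b; };

    /* 16 bytes: SysV x86-64 returns it in rax:rdx; Windows x64 returns it
       through a hidden pointer supplied by the caller. */
    struct Medium { int64_t a, b; };

    /* Hypothetical helper returning the 8-byte struct by value. */
    struct Small make_small(int32_t a, int32_t b) {
        struct Small s = { a, b };
        return s;
    }

    /* Hypothetical helper returning the 16-byte struct by value. */
    struct Medium make_medium(int64_t a, int64_t b) {
        struct Medium m = { a, b };
        return m;
    }

    /* Usage: the source looks the same on every platform; only the
       generated return sequence differs. */
    int64_t sum_medium(int64_t a, int64_t b) {
        struct Medium m = make_medium(a, b);
        assert(m.a == a && m.b == b);
        return m.a + m.b;
    }
    ```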

    • > Also, IMHO there is no such thing as a "C calling convention", there are many different calling conventions [ ... ]

      I did not say that. I said "C calling conventions" (plural). Rather aware of the fact that the devil is in the detail here ... heck, if you want it all, back in the bad old days, even the same compiler supported/used multiple ("fastcall" & Co, or on Win 3.x "pascal" for system interfaces, or the various ARM ABIs, ...).
