
Comment by derefr

6 months ago

If your code is plain C, then anyone can extend it with, or embed it inside, code of literally any other language; and in so doing, they will have full access/exposure to everything in your codebase — all the same stuff that they would if they were writing their host/extension code in C.

This is not true of C++ (or most other languages):

• C++ has a runtime (however minimal); and so, by including any C++ code in a codebase, you're making it much more difficult to link/embed the resulting code — you now have to also dynamically link the C++ runtime, and ensure that your host code spins it up "early", before any of the linked C++ code gets to run. (This may even be impossible in some host languages!)

• Also, even if there was no associated runtime to deal with, C++ isn't wholly C-FFI-clean. All the stuff that people like about C++ — all the reasons you'd want to use C++ — result in codebases that aren't cleanly C-FFI exposable, due to name mangling, functions taking parameters with non-C-exportable types, methods + closures not being C-FFI thunkable [and functions returning those], etc.

• And even if you bite that bullet, and write your library in C++ but carefully wrap its API to give it C-FFI-clean linkage (usually via a hybrid C / C++ project), this still introduces a layer of FFI runtime overhead. When another non-C language consumes your code, it's then getting double FFI overhead — a call from its code to yours has to convert from its abstractions, to C's abstractions, to C++'s abstractions, and back. (This is why you don't tend to see e.g. non-C++ projects embedding LLVM, or LLVM being extended with non-C++ passes, despite LLVM being designed in this "C wrapper around a C++ core" style.)
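
A minimal sketch of the "C wrapper around a C++ core" style from the last bullet, assuming a hypothetical `Greeter` class and hypothetical `greeter_*` function names (none of these come from a real library). The C++ object hides behind an opaque handle, and exceptions are caught at the boundary so they never unwind into the caller:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

namespace {
// Hypothetical C++ core: ordinary C++ with std::string, not visible to C callers.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    std::string greet() const { return "hello, " + name_; }
private:
    std::string name_;
};
}  // namespace

extern "C" {

// Opaque handle: C callers only ever see a pointer to an incomplete type.
typedef struct greeter greeter;

// Returns NULL on bad input or allocation failure.
greeter* greeter_create(const char* name) {
    if (!name) return nullptr;
    try {
        return reinterpret_cast<greeter*>(new Greeter(name));
    } catch (...) {
        return nullptr;  // never let an exception cross the C boundary
    }
}

// Copies the greeting into buf (truncating, always NUL-terminated when cap > 0).
// Returns the full length of the greeting, or 0 on error.
std::size_t greeter_greet(const greeter* g, char* buf, std::size_t cap) {
    if (!g) return 0;
    try {
        const std::string s = reinterpret_cast<const Greeter*>(g)->greet();
        if (buf && cap > 0) {
            const std::size_t n = s.size() < cap - 1 ? s.size() : cap - 1;
            std::memcpy(buf, s.data(), n);
            buf[n] = '\0';
        }
        return s.size();
    } catch (...) {
        return 0;
    }
}

// Destroys the object with the same C++ runtime that created it.
void greeter_destroy(greeter* g) {
    delete reinterpret_cast<Greeter*>(g);
}

}  // extern "C"
```

A host language would bind only `greeter_create`, `greeter_greet` and `greeter_destroy`; whether the extra hop costs anything measurable is exactly what the thread is arguing about.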

C is one of the only languages with a zero-impedance-mismatch, zero-overhead default or forced binding of external symbols to the C FFI (i.e. the C set of platform ABIs + C symbol naming standard.)

The others that do this are: C3 (https://c3-lang.org/); Zig, unless you do weird things on purpose, and... that's really it. Everything else has the same two problems as C++ outlined above.

Even Rust, even Odin, etc. only provide C-FFI linkage as an opt-in feature; and they do nothing to incentivize use of it; and so, of course, due to their useful non-C-FFI-clean features, developers are disincentivized from ever enabling it before they "need" it. So in practice, most libraries in those languages are not consumable from C [or other C-FFI-compatible languages], and most software in those languages is not extensible in C [or another C-FFI-compatible language], without extra effort on the upstream's part to add explicit support for doing that. And most upstreams don't bother.

Writing software in C itself, is essentially a way for a project to "tie itself to the mast" and commit to its ABI always being C-FFI clean; such that it can be consumed not only from C, but also from any other language a project might use that supports importing C-FFI libraries. (Which is most languages.)

C also has a (granted very small on Unix) runtime[1]. On Windows the C runtime is a bit larger, since Windows processes get a single string for their arguments, which must be parsed into argc/argv.
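
To give a feel for the kind of work that implies, here is a deliberately simplified sketch of turning one command-line string into an argument list. `split_command_line` is a hypothetical helper, and it only handles whitespace and double quotes; the real Windows C runtime rules (backslash escaping in particular) are more involved and are not reproduced here:

```cpp
#include <string>
#include <vector>

// Simplified splitter: whitespace separates arguments, double quotes group them.
// This is an illustration only, not the actual parsing rules of any C runtime.
std::vector<std::string> split_command_line(const std::string& cmd) {
    std::vector<std::string> args;
    std::string current;
    bool in_quotes = false;
    bool have_token = false;  // true once the current argument has started

    for (char c : cmd) {
        if (c == '"') {
            in_quotes = !in_quotes;
            have_token = true;  // so that "" yields an empty argument
        } else if ((c == ' ' || c == '\t') && !in_quotes) {
            if (have_token) {
                args.push_back(current);
                current.clear();
                have_token = false;
            }
        } else {
            current.push_back(c);
            have_token = true;
        }
    }
    if (have_token) args.push_back(current);
    return args;
}
```

On Unix the kernel hands the process an already-split argv, so none of this has to happen in the start-up code.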

As far as "you have to also dynamically link the C++ runtime", try calling malloc from two different libc implementations in the same process and see "interesting" things happen. Even more interesting is calling free() on a pointer that was malloc'd from a different C library.
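
The usual defence against the mixed-allocator problem, sketched under assumptions: a library that hands out heap memory also exports the matching release function, so the pointer always goes back to the allocator it came from. `mylib_strdup_upper` and `mylib_free` are hypothetical names made up for this example:

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <cstring>

extern "C" {

// Returns a newly allocated upper-cased copy of s, or NULL on failure.
// The memory comes from whichever malloc this library was linked against.
char* mylib_strdup_upper(const char* s) {
    if (!s) return nullptr;
    const std::size_t n = std::strlen(s);
    char* out = static_cast<char*>(std::malloc(n + 1));
    if (!out) return nullptr;
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    }
    out[n] = '\0';
    return out;
}

// Callers release the string through the library, never through their own free(),
// so allocation and deallocation always use the same C library.
void mylib_free(char* p) {
    std::free(p);
}

}  // extern "C"
```

This applies equally to a plain C library; the point is only that "which runtime owns this pointer" is a question C has to answer too.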

1: E.g. for musl https://git.musl-libc.org/cgit/musl/tree/crt

> C++ has a runtime (however minimal);

No it doesn't.

> Also, even if there was no associated runtime to deal with, C++ isn't wholly C-FFI-clean

Yes it is, you just extern "C" whatever you want.

> All the stuff that people like about C++ — all the reasons you'd want to use C++ — result in codebases that aren't cleanly C-FFI exposable

Not true at all: the two biggest things, destructors and move semantics, are still available everywhere except at the boundaries with C.
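
A small sketch of what that looks like in practice, assuming a hypothetical exported function `count_distinct` and helper `sorted_copy`. The signature is plain C, while the body uses a `std::vector` that is returned by move (or elided) and cleaned up by its destructor:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

namespace {
// Internal helper: returns by value, so the vector is moved or elided, not copied.
std::vector<int> sorted_copy(const int* xs, std::size_t n) {
    std::vector<int> v(xs, xs + n);
    std::sort(v.begin(), v.end());
    return v;
}
}  // namespace

// C-callable entry point (hypothetical name). Returns 0 on success,
// -1 on invalid arguments, -2 if an exception was caught internally.
extern "C" int count_distinct(const int* xs, std::size_t n, std::size_t* out) {
    if (!out || (!xs && n != 0)) return -1;
    try {
        std::vector<int> v = sorted_copy(xs, n);
        *out = static_cast<std::size_t>(std::unique(v.begin(), v.end()) - v.begin());
        return 0;  // v's destructor releases the buffer before control returns to C
    } catch (...) {
        return -2;  // exceptions are translated to an error code at the boundary
    }
}
```

Only pointers, integers and an error code cross the boundary; everything with a destructor stays on the C++ side.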

> And even if you bite that bullet, and write your library in C++ but carefully wrap its API to give it C-FFI-clean linkage (usually via a hybrid C / C++ project), this still introduces a layer of FFI runtime overhead

There is no overhead here; it is no different from C.

I don't know where all this comes from, but I doubt it comes from heavy experience with modern C++.

> C++ has a runtime (however minimal)

I'm not familiar with this, are you able to explain it? Do you mean something analogous to _start?

  • It's the stuff you find in LLVM's libc++abi (https://libcxxabi.llvm.org/spec.html) or GCC's libsupc++ (https://wiki.osdev.org/Libsupcxx).

    • These libraries provide the set of functions that compiled C++ code expects to call into to implement syntax-level features, like exception unwinding, destructors, and runtime type information (RTTI).

    • These libraries also define the C++ equivalents of the C runtime's _init and _fini functions. The C++ versions do a lot more: they set up the exception unwinder (copying and fixing-up .data-section DWARF tables into exception-unwind lookup tables for the active .text-section address-space mappings); they register global destructors; they register signal handlers (and per-thread-create handlers to re-register those signal handlers), to make the particular signal received by https://man7.org/linux/man-pages/man3/pthread_cancel.3.html trigger exception-unwind and destructors; ...etc.

    Usually, a C++ runtime support library gets static-linked into a C++ stdlib; and then a program implicitly receives linkage to the C++ runtime support library as part of statically or dynamically linking to the C++ stdlib.

    However, if you want a "minimal" C++ binary — and you never use anything in `std` (maybe because you're just using C++ to be able to write "C with templates") — then you can just link only the C++ runtime support library itself.

    (And yes, all of these C++ syntax features that require runtime support are technically optional. You can just `-fno-exceptions -fno-rtti -ffreestanding -fno-asynchronous-unwind-tables -fno-unwind-tables`, and then never link a C++ runtime support library at all, if you're willing to only use placement `new` and never use destructors or exceptions or RTTI. But at that point you're not exactly writing C++; you're writing an "embedded profile" of C++, akin to targeting the Java SE Embedded platform. Which makes sense if you're coding e.g. an RTOS kernel; but goes against much of the point if you're just writing e.g. the "native loadable module" for an HLL ecosystem package. Now your code can't depend on arbitrary third-party C++ code [because there isn't an "embedded C++" ecosystem, only a regular C++ ecosystem]; you can't find any FOSS contributors willing to help you maintain it [because 99% of people only know C++, not "embedded C++"]; and so on.)
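
A small sketch of the start-up and shutdown work mentioned a few paragraphs up (registering global destructors), with hypothetical names throughout. A namespace-scope object with a non-trivial constructor needs code to run before `main`, and its destructor has to be registered to run at exit; on Itanium-ABI platforms that registration typically goes through `__cxa_atexit`, which the runtime support library or libc supplies:

```cpp
namespace {

// Constant-initialized, so it is already 0 before any constructor runs.
int g_live_registries = 0;

// Hypothetical type with a non-trivial constructor and destructor.
struct Registry {
    Registry() { ++g_live_registries; }
    ~Registry() { --g_live_registries; }
};

// Dynamic initialization: the constructor call is emitted into start-up code
// that runs before main (or when the shared library is loaded), and the
// destructor is registered to run at process exit or library unload.
Registry g_registry;

}  // namespace

// C-callable accessor so a host can observe that initialization already happened.
extern "C" int live_registries() {
    return g_live_registries;
}
```

A host that loads this as a shared library gets the constructor call as part of loading; the object is already alive by the time it can call `live_registries()`.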