Comment by mgaunard
19 hours ago
It's fundamentally different; Rust entirely rejects the notion of a stable ABI, and simply builds everything from source.
C and C++ are usually stuck in that antiquated thinking that you should build a module, package it into some libraries, install/export the library binaries and associated assets, then import those in other projects. That makes everything slow, inefficient, and widely dangerous.
There are of course good ways of building C++, but those are the exception rather than the standard.
"Stable ABI" is a joke in C++ because you can't keep ABI and change the implementation of a templated function, which blocks improvements to the standard library.
In C, ABI = API because the declaration of a function contains the name and arguments, which is all the info needed to use it. You can swap out the definition without affecting callers.
That's why Rust allows a stable C-style ABI; the definition of a function declared in C doesn't have to be in C!
But in a C++-style templated function, the caller needs access to the definition to do template substitution. If you change the definition, you need to recompile calling code i.e. ABI breakage.
If you don't recompile calling code and link with other libraries that are using the new definition, you'll violate the one-definition rule (ODR).
This is bad because duplicate template functions are pruned at link-time for size reasons. So it's a mystery as to what definition you'll get. Your code will break in mysterious ways.
This means the C++ committee can never change the implementation of a standardized templated class or function. The only time they did was a minor optimization to std::string in 2011 and it was such a catastrophe they never did it again.
That is why Rust will not support stable ABIs for any of its features relying on generic types. It is impossible to keep the ABI stable and optimize an implementation.
It's not true that Rust rejects "the notion of a stable ABI". Rust rejects the C++ solution of freeze everything and hope because it's a disaster, it's less stable than some customers hoped and yet it's frozen in practice so it disappoints others. Rust says an ABI should be a promise by a developer, the way its existing C ABI is, that you can explicitly make or not make.
Rust is interested in having a properly thought out ABI that's nicer than the C ABI which it supports today. It'd be nice to have say, ABI for slices for example. But "freeze everything and hope" isn't that, it means every user of your language into the unforeseeable future has to pay for every mistake made by the language designers, and that's already a sizeable price for C++ to pay, "ABI: Now or never" spells some of that out and we don't want to join them.
> It'd be nice to have say, ABI for slices for example.
The de-facto ABI for slices involves passing/storing pointer and length separately and rebuilding the slice locally. It's hard to do better than that other than by somehow standardizing a "slice" binary representation across C and C-like languages. And then you'll still have to deal with existing legacy code that doesn't agree with that strict representation.
If Rust makes no progress towards choosing an ABI and decides that freezing things is bad, then Rust is de facto rejecting the notion of a stable ABI.
Rust is just a bit less than 11 years old, C++ was 13 years old when screwed up std::string ABI, so, I think Rust has a few years yet to do less badly.
Obviously it's easier to provide a stable ABI for say &'static [T] (a reference which lives forever to an immutable slice of T) or Option<NonZeroU32> (either a positive 32-bit unsigned integer, or nothing) than for String (amortized growable UTF-8 text) or File (an open file somewhere on the filesystem, whatever that means) and it will never be practical to provide some sort of "stable ABI" for arbitrary things like IntoIterator -- but that's exactly why the C++ choice was a bad idea. In practice of course the internal guts of things in C++ are not frozen, that would be a nightmare for maintenance teams - but in theory there should be no observable effect from such changes and so that discrepancy leads to endless bugs where a user found some obscure way to depend on what you'd hidden inside some implementation detail, the letter of the ISO document says your change is fine but the practice of C++ development says it is a breaking change - and the resulting engineering overhead at C++ vendors is made even worse by all the UB in real C++ software.
This is the real reason libc++ still shipped Quicksort as its unstable sort when Biden was President, many years after this was in theory prohibited by the ISO standard† Fixing the sort breaks people's code and they'd rather it was technically faulty and practically slower than have their crap code stop working.
† Tony's Quicksort algorithm on its own is worse than O(n log n) for some inputs, you should use an introspective comparison sort aka introsort here, those existed almost 30 years ago but C++ only began to require them in 2011.
> C and C++ are usually stuck in that antiquated thinking that you should build a module, package it into some libraries, install/export the library binaries and associated assets, then import those in other projects. That makes everything slow, inefficient, and widely dangerous.
It seems to me the "convenient" options are the dangerous ones.
The traditional method is for third party code to have a stable API. Newer versions add functions or fix bugs but existing functions continue to work as before. API mistakes get deprecated and alternatives offered but newly-deprecated functions remain available for 10+ years. With the result that you can link all applications against any sufficiently recent version of the library, e.g. the latest stable release, which can then be installed via the system package manager and have a manageable maintenance burden because only one version needs to be maintained.
Language package managers have a tendency to facilitate breaking changes. You "don't have to worry" about removing functions without deprecating them because anyone can just pull in the older version of the code. Except the older version is no longer maintained.
Then you're using a version of the code from a few years ago because you didn't need any of the newer features and it hadn't had any problems, until it picks up a CVE. Suddenly you have vulnerable code running in production but fixing it isn't just a matter of "apt upgrade" because no one else is going to patch the version only you were using, and the current version has several breaking changes so you can't switch to it until you integrate them into your code.
This is all wishful thinking disconnected from practicalities.
First you confuse API and ABI.
Second there is no practical difference between first and third-party for any sufficiently complex project.
Third you cannot have multiple versions of the same thing in the same program without very careful isolation and engineering. It's a bad idea and a recipe for ODR violations.
In any non-trivial project there will be complex dependency webs across different files and subprojects, and humans are notoriously bad at packaging pieces of code into sensible modules, libraries or packages, with well-defined and maintained boundaries. Being able to maintain ABI compatibility, deprecating things while introducing replacement etc. is a massive engineering work and simply makes people much less likely to change the way things are done, even if they are broken or not ideal. That's an effort you'll do for a kernel (and only on specific boundaries) but not for the average program.
> First you confuse API and ABI.
I'm not confusing API with ABI. If you don't have a stable ABI then you essentially forfeit the traditional method of having every program on the system use the same copy (and therefore version) of that library, which in turn encourages them to each use a different version and facilitates API instability by making the bad thing easier.
> Second there is no practical difference between first and third-party for any sufficiently complex project.
Even when you have a large project, making use of curl or sqlite or openssl does not imply that you would like to start maintaining a private fork.
There are also many projects that are not large enough to absorb the maintenance burden of all of their external dependencies.
> Third you cannot have multiple versions of the same thing in the same program without very careful isolation and engineering.
Which is all the more reason to encourage every program on the system to use the same copy by maintaining a stable ABI. What do you do after you've encouraged everyone to include their own copy of their dependencies and therefore not care if there are many other incompatible versions, and then two of your dependencies each require a different version of a third?
> In any non-trivial project there will be complex dependency webs across different files and subprojects, and humans are notoriously bad at packaging pieces of code into sensible modules, libraries or packages, with well-defined and maintained boundaries.
This feels like arguing that people are bad at writing documentation so we should we should reduce their incentive to write it, instead of coming up with ways to make doing the good thing easier.
I would suggest importing binaries and metadata is going to be faster than compiling all the source for that.
You'd be wrong. If the build system has full knowledge on how to build the whole thing, it can do a much better job. Caching the outputs of the build is trivial.
If you import some ready made binaries, you have no way to guarantee they are compatible with the rest of your build or contain the features you need. If anything needs updating and you actually bother to do it for correctness (most would just hope it's compatible) your only option is usually to rebuild the whole thing, even if your usage only needed one file.
"That makes everything slow, inefficient, and widely dangerous."
There nothing faster and more efficient than building C programs. I also not sure what is dangerous in having libraries. C++ is quite different though.
Of course there is. Raw machine code is the gold standard, and everything else is an attempt to achieve _something_ at the cost of performance, C included, and that's even when considering whole-program optimization and ignoring the overhead introduced by libraries. Other languages with better semantics frequently outperform C (slightly) because the compiler is able to assume more things about the data and instructions being manipulated, generating tighter optimizations.
I was talking about building code not run-time. But regarding run-time, no other language does outperform C in practice, although your argument about "better semantics" has some grain of truth in it, it does not apply to any existing language I know of - at least not to Rust which is in practice for the most part still slower than C.
ODR violations are very easy to trigger unless you build the whole thing from source, and are ill-formed, no diagnostic required (worse than UB).
Neither "ODR violations" nor IFNDR exist in C. Incompatibility across translation units can cause undefined behavior in C, but this can easily be avoided.
1 reply →
>There are of course good ways of building C++, but those are the exception rather than the standard.
What are the good ways?
Build everything from source within a single unified workspace, cache whatever artifacts were already built with content-addressable storage so that you don't need to build them again.
You should also avoid libraries, as they reduce granularity and needlessly complexify the logic.
I'd also argue you shouldn't have any kind of declaration of dependencies and simply deduce them transparently based on what the code includes, with some logic to map header to implementation files.
The problem is doing this requires a team to support it that is realistically as large as your average product team. I know Bazel is the solution here but as someone who has used C++, modified build systems and maintained CI for teams for years, I have never gotten it to work for anything more than a toy project.
4 replies →
>Build everything from source within a single unified workspace, cache whatever artifacts were already built with content-addressable storage so that you don't need to build them again.
Which tool do you use for content-addressable storage in your builds?
>You should also avoid libraries, as they reduce granularity and needlessly complexify the logic.
This isn't always feasible though.
What's the best practice when one cannot avoid a library?
1 reply →
"Do not do it" looks like the winning one nowadays.