Comment by coffeeaddict1
6 months ago
The C++ Core Guidelines have existed for nearly 10 years now. Despite this, not a single implementation in any of the three major compilers exists that can enforce them. Profiles, which Bjarne et al have had years to work on, will not provide memory safety[0]. The C++ committee, including Bjarne Stroustrup, needs to accept that the language cannot be improved without breaking changes. However, it's already too late. Even if somehow they manage to make changes to the language that enforce memory safety, it will take a decade before the efforts propagate at the compiler level (a case in point is modules being standardised in 2020 but still not ready for use in production in any of the three major compilers).
> The C++ committee, including Bjarne Stroustrup, needs to accept that the language cannot be improved without breaking changes.
The example in the article starts with "Wow, we have unordered maps now!" Just adding things modern languages have is nice, but doesn't fix the big problems. The basic problem is that you can't throw anything out. The mix of old and new stuff leads to obscure bugs. The new abstractions tend to leak raw pointers, so that old stuff can be called.
C++ is almost unique in having hiding ("abstraction") without safety. That's the big problem.
I find the unordered_map example rather amusing. C++’s unordered_map is, somewhat infamously, specified in an unwise way. One basically cannot implement it with a modern, high performance hash table for at least two reasons:
1. unordered_map requires some bizarre and not widely useful abilities that mostly preclude hash tables with probing:
https://stackoverflow.com/questions/21518704/how-does-c-stl-...
2. unordered_map has fairly strict iteration and pointer invalidation rules that are largely incompatible with the implementations that turn out to be the fastest. See:
> References and pointers to either key or data stored in the container are only invalidated by erasing that element, even when the corresponding iterator is invalidated.
https://en.cppreference.com/w/cpp/container/unordered_map
And, of course, this is C++, where (despite the best efforts of the “profiles” people), the only way to deal with lifetimes of things in containers is to write the rules in the standards and hope people notice. Rust, in contrast, encodes the rules in the type signatures of the methods, and misuse is deterministically caught by the compiler.
Like std::vector, std::unordered_map also doesn't do a good job on reservation. I've never been entirely sure what to make of that: did they not care? Or is there some subtle reason why what they're doing made sense on the 1980s computers where this was conceived?
For std::vector it apparently just didn't occur to the C++ people to provide the correct API; Bjarne Stroustrup claims the only reason to use a reservation API is to prevent reference and iterator invalidation. -shrug-
[std::unordered_map was standardised this century, but, the thing standardised isn't something you'd design this century, it's the data structure you'd have been shown in an undergraduate Data Structures class 40 years ago.]
3 replies →
You absolutely can throw things out, and they have! Checked exceptions, `auto`, and breaking changes to operator== are the ones I know of. There were also some minor breaking changes to comparison operators in C++20.
They absolutely could say "in C++26 vector::operator[] will be checked" and add an `.at_unsafe()` method.
They won't though because the whole standards committee still thinks that This Is Fine. In fact the number of "just get good" people in the committee has probably increased - everyone with any brains has run away to Rust (and maybe Zig).
> auto
It took me several reads to figure out that you probably meant ‘auto’ the storage class specifier. And now I’m wondering whether this was ever anything but a no-op in C++.
> "in C++26 vector::operator[] will be checked"
Every major project that cares about perf and binary size would disable the checks via an option that compiler vendors would obviously provide, like -fno-exceptions.
Rust's memory and type system offers stronger guarantees, leading to better optimization of bounds checks, AFAIK.
There are more glaring issues to fix, like std::regex performance and so on.
"just get good" implies development processes that catch memory and safety bugs. Meaning what they are really saying between the lines is that the minimum cost of C++ development is really high.
Any C++ code without at least unit tests with 100% coverage run under the UB sanitizer etc. must be considered inherently defective, and the developer should be flogged for his absurd levels of incompetence.
Then there is also the need for UB aware formal verification. You must define predicates/conditions under which your code is safe and all code paths that call this code must verifiably satisfy the predicates for all calls.
This means you're down to the statically verifiable subset of C++, which includes C++ that performs asserts at runtime in cases where the condition cannot be verified at compile time.
How many C++ developers are trained in formal verification? As far as I am aware, they don't exist.
Any C++ developers reading this who haven't at least written unit tests with UB sanitizer for all of their production code should be ashamed of themselves. If this sounds harsh, remember that this is merely the logical conclusion of "just get good".
1 reply →
That explains very well why Rust (to me) feels like it was C++ommittee-designed, thanks for that!
5 replies →
While I sort of agree with the complaint, personally I think the best spot for C++ in this ecosystem is still great backward compatibility plus marginal safety improvements.
I would never expect our 10M+ LOC performance-sensitive C++ code base to be formally memory safe, but so far only C++ has allowed us to maintain it for 15 years with partial refactors and minimal upgrade pain.
I think at least Go and Java have as good backwards compatibility as C++.
Most languages take backwards compatibility very seriously. It was quite a surprise to me when Python broke so much code with the 3.12 release. I think it's the exception.
I don't know about Go, but Java is pathetic. I have 30-year-old C++ programs that work just fine.
However, an application that I had written to be backward compatible with Java 1.4, 15 years ago, cannot be compiled today. And I had to make major changes to have it run on anything past Java 8, ~10 years ago, I believe.
6 replies →
Java has had shit backwards compatibility for as long as I have had to deal with it. Maybe it's better now, but I have not forgotten the days of "you have to use exactly Java 1.4.15 or this app won't work"... with four different apps that each need their own different version of the JRE or they break. The only thing that finally made Java apps tolerable to support was the rise of app virtualization solutions. Before that, it was a nightmare and Java was justly known as "the devil's software" to everyone who had to support it.
8 replies →
The language is improving (?), although IME it's beside the point: I'm finding new features less and less useful for everyday code. I'm perfectly happy with C++17/20 for 99% of the code I write. And keeping backwards compatibility for most real-world software is a feature, not a bug, ok? Breaking it would actually drive me away from the language.
CLion, clang-tidy and the Visual C++ analysers do have partial support for the Core Guidelines, and they can be enforced.
Granted, it is only those that can be machine verified.
Office is using C++20 modules in production, Vulkan also has a modules version.
>Despite this, not a single implementation in any of the three major compilers exists that can enforce them
Because no one wants it enough to implement it.
I feel like a few decades ago, standards intended to standardize best practices and popular features from compilers in the field. Dreaming up standards that nobody has implemented, like what seems to happen these days, just seems crazy to me.
It's bottom-up vs top-down design.
Or it's better to have other languages besides C++ for that.
I hoped Sean would open source Circle. It seemed promising, but it's been years and I don't see any tangible progress. Maybe I am not looking hard enough?
He's looking to sell Circle. That must be the reason he's not open sourcing it.
Huh. I guess that explains it, if that was the motivation all along.
I think Carbon is more promising to be honest. They are aiming for something production-ready in 2027.
Profiles will not provide perfect memory safety, but they go a long way to making things better. I have 10 million lines of C++. A breaking change (doesn't matter if you call it new C++ or Rust) would cost over a billion dollars - that is not happening. Which is to say I cannot use your perfect solution, I have to deal with what I have today and if profiles can make my code better without costing a full rewrite then I want them.
Changes which redefine the language to have less UB will help you if you want safety/correctness and are willing to do some work to bring that code to the newer language. An example would be the initialization rules in (draft) C++26. Historically C++ was OK with you just forgetting to initialize a primitive before using it; that's Undefined Behaviour in the language, so if it happens, too bad, all bets are off. In C++26 that becomes Erroneous Behaviour: there's some value in the variable. It's not always guaranteed to be a valid value (which can be a problem for, say, booleans or pointers), but just looking at it is no longer UB. And if you forgot to initialize, say, an int or a char, that's fine, since any possible bit sequence is valid; what you did was an error, but it's not necessarily fatal.
If you're not willing to do any work then you're just stuck, nobody can help you, magic "profiles" don't help either.
But, if you're willing to do work, why stop at profiles? Now we're talking about a price and I don't believe that somehow the minimum assignable budget is > $1Bn
The first part is why I'm excited for future C++ - they are making things better.
The reason I like profiles is that they are not all or nothing. I can apply them to new code only, or maybe to a single file that I'm willing to take the time to refactor. Or at least so I hope; it remains to be seen if that is how they work out. I've been trying to figure out how to make Rust fit in, but std::vector<SomeVirtualInterface> is a real pain to wrap into Rust, and so far I haven't managed to get anything done there.
The $1 billion is realistic - this project was a rewrite of a previous product that became unmaintainable, and inflation-adjusted the cost was $1 billion. You can maybe adjust that down a little if we are more productive, but not much. You can adjust it down a lot if you can come up with a way to keep our existing C++ and just add new features, fixing the old code only where it really is a problem. The code we wrote in C++98 (because that was all we had in 2010) still compiles with the latest C++23 compiler, and since there are no known bugs it isn't worth updating that code to the latest standards, even though it would be a lot easier to maintain (which we never do) if we did.
1 reply →
This seems bad actually.
Enforcing style guidelines seems like an issue that should be tackled by non-compiler tools. It is hard enough to make a compiler without rolling in a ton of subjective standards (yes, the core guidelines are subjective!). There are lots of other tools that have partial support for detecting and even fixing code according to various guidelines.
It's part of a compiler ecosystem. ie. The front end is shared.
See clang-tidy and clang analyzer for example.
ps: That's what I like most about the core guidelines, they are trying very hard to stick to guidelines (not rules) that pretty much uncontroversially make things safer _and_ can be checked automatically.
They're explicitly walking away from bikeshed paintings like naming conventions and formatting.
The core guidelines aren't as subjective as other guidelines but they are still subjective. There is plenty of completely sound code out there that violates the core guidelines. Not only are they subjective, but many of them require someone to think about the best way to write the code and whether the unpopular way to write it is actually better.
I know compiler front ends can be and are used to create tooling. The point is, you shouldn't be required to implement some kinds of checking in the course of implementing a compiler. If you use a compiler, you should not be required to do all this analysis every single time you compile (unless it is enforcing an objectively necessary standard, and the cost of running it is negligible).
What are you talking about, the language gets better with each release. Using C++ today is a hell of a lot better than even 10 years ago. It seems like people hold "memory safety" as the most important thing a language can have. I completely disagree. It turns out you can build awesome and useful software without memory safety. And it's not clear if memory safety is the largest source of problems building software today.
In my opinion, having good design and architecture are much higher on my list than memory safety. Being able to express my mental model as directly as possible is more important to me.
The top memory safety bugs in shipped code for C and C++ are out of bounds array indexing.
Are you sure? I generally see more use-after-free and other lifetime issues.
1 reply →
Does it matter whether it is a common class of bugs or a not so common one? The point is, this is a class of bugs you do not have when picking a different language.
C++ claimed for decades to be about eliminating a class of resource management bugs you can have in C code, that was its biggest selling point. So why is eliminating another class of bugs a nice to have now?
C++ has been losing projects to memory-safe languages for decades now; just think of all the business software in Java, scientific SW in Python, ... The industry has been moving towards memory-safe software for decades. Rust is just the newest option -- and a very compelling one, as it has no runtime environment or garbage collector, just like C++.
> And it's not clear if memory safety is the largest source of problems building software today.
The Chromium team found that
> Around 70% of our high severity security bugs are memory unsafety problems (that is, mistakes with C/C++ pointers). Half of those are use-after-free bugs.
Chromium Security: Memory Safety (https://www.chromium.org/Home/chromium-security/memory-safet...)
Microsoft found that
> ~70% of the vulnerabilities Microsoft assigns a CVE each year continue to be memory safety issues
A proactive approach to more secure code (https://msrc.microsoft.com/blog/2019/07/a-proactive-approach...)
It’s possible you hadn’t come across these studies before. But if you have, and you didn’t find them convincing, what did they lack?
- Were the codebases not old enough? They’re anywhere between 15 and 30 years old, so probably not.
- Did the codebases not have enough users? I think both have billions of active users, so I don’t think so.
- Was it a “skill issue”? Are the developers at Google and Microsoft just not that good? Maybe they didn’t consider good design and architecture at any point while writing software over the last couple of decades. Possible!
There’s just one problem with the “skill issue” theory though. Android, presumably staffed with the same calibre of engineers as Chrome, also written in C++ also found that 76% of vulnerabilities were related to memory safety. We’ve got consistency, if nothing else. And then, in recent years, something remarkable happened.
> the percentage of memory safety vulnerabilities in Android dropped from 76% to 24% over 6 years as development shifted to memory safe languages.
Eliminating Memory Safety Vulnerabilities at the Source (https://security.googleblog.com/2024/09/eliminating-memory-s...)
They stopped writing new C++ code and the memory safety vulnerabilities dropped dramatically. Billions of Android users are already benefiting from much more secure devices, today!
You originally said
> And it's not clear if memory safety is the largest source of problems building software today.
It is possible to defend this by saying “what matters in software is product market fit” or something similar. That would be technically correct, while side stepping the issue.
Instead I’ll ask you: do you still think it is possible to write secure software in C++ by just trying a little harder? Through “good design and architecture”, as your previous comment implied.
Two of the biggest use cases for modern C++ are video games and HFT, where memory safety is of absolutely minimal importance (unless you're writing some shitty DRM/anticheat). I work in HFT using modern C++ and bugs related to memory safety are vanishingly rare compared to logic and performance bugs.
4 replies →
> Around 70% of our high severity security bugs are memory unsafety problems
> ~70% of the vulnerabilities Microsoft assigns a CVE
> 76% of vulnerabilities
What is the difference between the first two (emphasis added) and what you said? Just as a thought experiment...
If I measure a single factor to the exclusion of all others, I can also find whatever I want in any set of data. Now, your point may be valid, but it is not what they published, and without the full dataset we cannot validate your claim. However, I can validate that what you claim is not what they claim.
To answer the question in your final paragraph: yes it is, but it requires the same cultural shift as what it would take to write the same code in Rust or Swift or Golang or whatever other memory-safe language you want to pick.
If rust was in fact viable for such a large project, how's the servo project going? That still the resounding success it was expected to be? Rust in the kernel? That going well?
The jury is still out on whether rust will be mass adopted and is able to usurp C/C++ in the domains where C/C++ dominate. It may get there, but I would much much rather start a new project using C++20 than in rust and I would still be able to make it memory safe and yes it is a "skill issue", but purely because of legacy C++ being taught and accepted in new code in a codebase.
Rules for writing memory-safe C++ have not just been around for decades but have been statically checkable for over a decade. For a large project, though, there are too many errors to universally apply them to existing code without years of work. However, if you submit new code using old practices, you should be held financially and legally responsible, just like an actual engineer in another field would be.
It's because we are lax about standards that it's even an issue.
As a note, if you see an Arc<Mutex<>> in Rust outside of some very specific library code, whoever wrote it probably wouldn't be able to write the same code in a memory- and thread-safe manner; also, that is an architectural issue.
Arc and Mutex are synchronisation primitives that are meant to be used to build data structures, not in "userspace" code. It's a strong code smell that is generally accepted in Rust. Arc probably shouldn't even need to exist at all, because reaching for it is a clear indication nobody thought about the ownership semantics of the data in question; maybe for some data structures it is required, but you should very likely not be typing it into general code.
If Arc<Mutex<>> is littered throughout your rust codebase you probably should have written that code in C#/Java/Go/pick your poison...
2 replies →
A million times more systems were infiltrated due to PHP SQL injection bugs than were infiltrated via Chromium use-after-free bugs.
Let's keep some sanity and perspective here, please. C++ has many long-standing problems, but banging on the "security" drum will only drive people away from alternative languages. (Everyone knows that "security" is just a fig leaf they use to strong-arm you into doing stuff you hate.)
> Profiles, which Bjarne et al have had years to work on, will not provide memory safety
While I agree with this in a general sense, I think it ought to be quite possible to come up with a "profile" spec that's simply meant to enforce the language restriction/subsetting part of Safe C++ - meaning only the essentials of the safety checking mechanism, including the use of the borrow checker. Of course, this would not be very useful on its own without the language and library extensions that the broader Safe C++ proposal is also concerned with. It's not clear as of yet if these can be listed as part of the same "profile" specifications or would require separate proposals of their own. But this may well be a viable approach.
I have seen 3 different safe C++ proposals (most are not papers yet, but they are serious efforts to show what safe C++ could look like). However, there is a tradeoff here: the full borrow-checker-in-C++ approach is incompatible with all current C++, so adopting it is about as difficult as rewriting all your code in some other language. The other proposals are not as safe, but they offer different levels of "you can use this with your existing code". None are ready to be added to C++, but they all provide something better, and I'm hopeful that something gets into C++ (though probably not before C++32).
>the full borrow checker in C++ approach is incompatible with all current C++
Circle is an implementation of C++ that includes a borrow checker and is 100% backwards compatible with C++:
https://www.circle-lang.org/site/index.html
8 replies →
I've seen maybe twice that many. Did one myself once. It's possible to make forward progress, but to get any real safety you have to prohibit some things.