Comment by simonask
9 hours ago
I think the article's point is that you don't actually have to get weird at all to run into UB.
Lots of people mistakenly think that C and C++ are "really flexible" because they let you do "what you want". The truth of the matter is that almost every fancy, powerful thing you think you can do is an absolute minefield of UB.
My go-to example of "UB is everywhere" is this one:
Which is UB for certain values of x.
C23 removed the whole stuff about indeterminate value and trap representation. Underflow/overflow being silent or not is implementation defined.
Signed overflow is just undefined.
I would agree that C is "really flexible", but I would say it's primarily flexible because it lets you cast say from a void pointer to a typed pointer without requiring much boilerplate. It's also flexible because it lets you control memory layout and resource management patterns quite closely.
If you want to be standards correct, yes you have to know the standard well. True. And you can always slip, and learn another gotcha. Also true. But it's still extremely flexible.
The problem is that a lot of the flexibility introduced by UB doesn't serve the developer.
Take signed integer overflow, for example. Making it UB might've made sense in the 1970s when PDP-1 owners would've started a fight over having to do an expensive check on every single addition. But it's 2026 now. Everyone settled on two's complement, and with speculative execution the check is basically free anyways. Leaving it UB serves no practical purpose, other than letting the compiler developer skip having to add a check for obscure weird legacy architectures. Literally all it does is serve as a footgun allowing over-eager optimizations to blow up your program.
Although often a source of bugs, C's low-level memory management is indeed a great source of flexibility with lots of useful applications. It's all the other weird little UB things which are the problem. As the article title already states: writing C means you are constantly making use of UB without even realizing it - and that's a problem.
If we're talking two's complement, it's not undefined, that is right. Having to emit checks, though, is where I beg to differ. A check is only useful if you actually want to change the behavior when overflow happens; otherwise it is useless. Furthermore, it might be "essentially free" from a branch prediction standpoint, but lo and behold, caches exist. You would pollute both the instruction cache with those instructions _and_ the branch prediction cache. From this it doesn't follow at all that there is no cost.
In the end small things do add up, and if you're adding many little things "because it doesn't cost much nowadays" you will end up with slow software and not have one specific bottleneck to look at. I do agree that having the option for checked operations is nice (see C#), but I have needed this behavior (branching on overflow) exactly once so far.
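For readers wondering what an opt-in checked addition can look like in C, here is a minimal sketch. The helper name `checked_add_int` is hypothetical and not from the thread; the idea is only to test the operands against `INT_MAX` and `INT_MIN` before adding, so the overflowing addition is never executed.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the thread): adds a and b, reporting
   overflow instead of triggering it. The range comparisons themselves
   cannot overflow, so no undefined behavior is involved. */
bool checked_add_int(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;   /* the mathematical sum does not fit in int */
    }
    *out = a + b;       /* safe: the sum is known to be representable */
    return true;
}
```

The caller decides what to do on `false`, which is where the branch cost discussed above comes from. GCC and Clang also provide `__builtin_add_overflow`, and C23 adds `ckd_add` in `<stdckdint.h>`, which serve the same purpose without the hand-written comparisons.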
Signed overflow checks are typically not free, unfortunately; they have a cost of about 5% or thereabouts.
You can run your code under ASAN and UBSAN nowadays; they will catch many or most issues as they happen.
But that's completely beside the point. UB on signed overflow, or really most UB, is not related to C's flexibility. It is a detail of the spec related to portability and performance. IIRC it is even required to make such trivial optimizations as turning
into
saving arithmetic and saving a register, on architectures where `int` is smaller than a pointer. But there are also options like -fwrapv on GCC, for example, allowing you to actually use signed overflow.
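One illustration of this kind of optimization (a sketch of my own, not necessarily the snippet the comment had in mind) is an array loop indexed by `int`. The function names are hypothetical. The two functions below compute the same result whenever `i + off` does not overflow. Because signed overflow is undefined, a compiler may assume exactly that and treat the first form like the second, advancing a single pointer instead of recomputing and sign-extending `i + off` on every iteration.

```c
/* Hypothetical example: index arithmetic done in int on each iteration. */
long sum_offset_indexed(const int *a, int off, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++) {
        sum += a[i + off];  /* i + off is computed in int, then widened */
    }
    return sum;
}

/* The pointer-based form a compiler may effectively produce from the
   loop above, assuming i + off never overflows. */
long sum_offset_pointer(const int *a, int off, int n)
{
    if (n <= 0) {
        return 0;           /* avoid forming a + off when nothing is read */
    }
    long sum = 0;
    const int *p = a + off;
    for (int k = 0; k < n; k++) {
        sum += *p++;        /* one pointer increment per element */
    }
    return sum;
}
```

If the addition were instead required to wrap, `i + off` could jump from `INT_MAX` to `INT_MIN` partway through the loop, and the two forms would no longer be interchangeable.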
It's not flexible in practice, because knowing the standard isn't optional. If you make the choice to not follow the standard, you're making the choice to write fundamentally broken software. Sometimes with catastrophic consequences.
I'm making the choice to pass pointers as void to get low-friction polymorphism. I'm making the choice to control the memory layout of my data structures, including the levels and types of indirection. I'm making the choice to control my own memory allocators and closely control lifetimes, closely control (almost) everything that happens in the system.
That has nothing to do with not following the standard.
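As a sketch of the void-pointer polymorphism described here, in the style of the standard library's `qsort`: the names `generic_max`, `cmp_int` and `cmp_double` are hypothetical, chosen only for illustration. Everything in it is well-defined C as long as the caller passes a comparator that matches the element type.

```c
#include <stddef.h>

/* Hypothetical generic helper: returns a pointer to the largest of n
   elements, each `size` bytes wide, as ordered by `cmp`.
   Returns NULL for an empty array. */
const void *generic_max(const void *base, size_t n, size_t size,
                        int (*cmp)(const void *, const void *))
{
    const unsigned char *bytes = base;  /* byte pointer for stepping */
    const void *best = NULL;
    for (size_t i = 0; i < n; i++) {
        const void *elem = bytes + i * size;
        if (best == NULL || cmp(elem, best) > 0) {
            best = elem;
        }
    }
    return best;
}

/* Comparator for int elements. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids x - y, which could overflow */
}

/* Comparator for double elements. */
int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}
```

The same `generic_max` works for `int`, `double`, or any struct, at the price that the compiler cannot check that the comparator and the array agree on the element type.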
At which point it feels like some sort of high-level assembly-like language, which is simple enough to compile efficiently and stay crossplatform, with some primitives for calls, jumps, etc. could find a nice niche.
Maybe this already exists, even? A stripped down version of C? A more advanced LLVM IR? I feel like this is a problem that could use a resolution, just maybe not with enough of a scale for anyone to bother, vs. learning C, assembly of given architecture, or one of the new and fancy compiled languages.
Yes, there have been quite a few C-inspired assembly languages, for DSPs for example; TI had one.
There's Vale [0] as a structured high-level assembly language, but pretty far from usable right now. I do hope it matures. Basically: All non-control-flow instructions can be directly supported. Control flow is lofted to a higher level and implemented in C-style structured blocks and keywords, which map directly to a subset of the ISA that modifies the program counter. This separation means it's not a proper superset of traditional assembly languages -- you can't paste in arbitrary blocks of existing code -- but a lot of interesting things (for them, implementations of cryptographic primitives) are pretty trivial to port over. And in exchange, you get a well defined Hoare logic that can talk about total correctness, not just [1]'s partial correctness.
[0] https://github.com/project-everest/vale
[1] https://nickbenton.name/coqasm.pdf
Well, Zig is aiming to be a "saner C", and mostly succeeding so far. I hope they make it to production.
Rust is a somewhat more thorough attempt to actually course-correct.
It is basically what you can have today with Object Pascal or Modula-2, with a revamped syntax for C crowds.