Comment by drnick1
8 hours ago
Yes and no. Clearly what you said is true, but the more profound reason is that C just minimally reflects how computers work. The rest is just convention.
More concretely, I think the magic lies in these two properties:
1. Conservation of mass: the amount of C code you put in will be pretty close to the amount of machine code you get out. Aside from the preprocessor, which is very obviously expanding macros, there are almost no features of C that will take a small amount of code and expand it to a large amount of output. This makes some things annoyingly verbose to code in C (eg. string manipulation), but that annoyance reflects a true fact of machine code, which is that it cannot handle strings very easily. (See the first sketch below.)
2. Conservation of energy: the only work that will be performed is the code that you put into your program. There is no "supervisor" performing work on the side (garbage collection, stack checking, context switching) on your behalf. From a practical perspective, this means that the machine code produced by a C compiler is standalone, and can be called from any runtime without needing a special environment to be set up. This is what makes C such a good language for implementing garbage collection, stack checking, context switching, etc. (See the second sketch below.)
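To make (1) concrete, here's a rough sketch, not any real API: a hand-rolled string copy where each statement corresponds to roughly one load, store, compare, or branch in the emitted machine code, and the verbosity mirrors how awkward strings are at that level.

```c
#include <stddef.h>

/* Illustration of (1): each statement corresponds to roughly one
   load/store/compare/branch in the output machine code. */
char *copy_string(char *dst, const char *src) {
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;   /* one byte copied per iteration */
    return dst;
}
```

And a sketch of (2), why C is a natural language for implementing allocators and GCs: a bump allocator where every instruction it compiles to is visible in the source, with no hidden runtime behind it. The names and arena size are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustration of (2): no hidden runtime, just the work you wrote. */
static uint8_t heap[1 << 20];   /* fixed 1 MiB arena, illustrative size */
static size_t  heap_used;

void *bump_alloc(size_t size) {
    size = (size + 7) & ~(size_t)7;        /* round up to 8-byte alignment */
    if (size > sizeof heap - heap_used)
        return NULL;                       /* arena exhausted */
    void *p = &heap[heap_used];
    heap_used += size;
    return p;
}
```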
There are some exceptions to both of these principles. Auto-vectorizing compilers can produce large amounts of output from small amounts of input. Some C compilers do support stack checking (eg. `-fstack-check`). Some implementations of C will perform garbage collection (eg. Boehm, Fil-C). For dynamically linked executables, the PLT stubs will perform hash table lookups the first time you call a function. The point is that C makes it very possible to avoid all of these things, which has made it a great technology for programming close to the machine.
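To illustrate the auto-vectorization exception: a loop like the one below is tiny in source form, but GCC or Clang at `-O3` will often expand it into a SIMD main loop plus scalar prologue/epilogue code that is much larger than the input.

```c
/* At -O3, GCC/Clang will often auto-vectorize this, emitting far more
   machine code (SIMD body + scalar prologue/epilogue) than the source
   suggests -- an exception to "conservation of mass". */
void saxpy(float *restrict y, const float *restrict x, float a, int n) {
    for (int i = 0; i < n; i++)
        y[i] += a * x[i];
}
```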
Some languages excel at one but not the other. Bytecode-oriented languages generally do well at (1): for example, Java .class files are usually pretty lean, as the bytecode semantics are pretty close to the Java language. Go is also pretty good at (1). Languages like C++ or Rust are generally good at (2), but have much larger binaries on average than C thanks to generics, exceptions/panics, and other features. C is one of the few languages I've seen that does both (1) and (2) well.
It minimally reflects PDP-11 assembly, which is not how modern computers work.
This is a meme which is repeated often, but not really true. If you disagree, please state specifically what property of the PDP-11 you think is different from how modern computers work, and where this affects C but not other languages.
It lacked SIMD instructions.
The things complained about in the article are not a minimal reflection of how computers work.
Take the "wobbly types" for example. It would have been more "minimal" to have types tied directly to their sizes instead of having short, int, long, etc.
There isn't any reason that compilers on the same platform have to disagree on the layout of the same basic type, but they do.
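To make this concrete: even on the same x86-64 hardware, `long` is 8 bytes under the Linux/macOS LP64 ABI but 4 bytes under the Windows LLP64 ABI, which is exactly the kind of wobble `<stdint.h>` was added to paper over.

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    printf("short:   %zu\n", sizeof(short));
    printf("int:     %zu\n", sizeof(int));
    printf("long:    %zu\n", sizeof(long));    /* 8 on LP64, 4 on LLP64 */
    printf("int64_t: %zu\n", sizeof(int64_t)); /* always 8 */
    return 0;
}
```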
The complaints about parsing header files could potentially be solved by an IDL that could compile to C header files and FFI definitions for other languages. It could even be a subset of C that is easier to parse. But nothing like that has ever caught on.
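Such an interface file might just be a restricted, parse-friendly C subset, something like this (entirely hypothetical, since no such tool caught on; all names are made up):

```c
/* Hypothetical interface file: a C subset with no macros, no function
   bodies, and only fixed-width types, from which a tool could emit
   both a real C header and FFI bindings for other languages. */
#include <stdint.h>

typedef struct point { int32_t x; int32_t y; } point;

int32_t point_distance(const point *a, const point *b);
```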
There were many different types of computers back then. Some even had 36-bit word sizes. I don't think there was any clear winner like amd64 back then that they could have prioritized; 16-bit and 32-bit machines both existed in decent numbers, and so on.
How does C reflect how AVX works?
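It mostly doesn't: standard C has no vector types, and you reach AVX through compiler-provided intrinsics rather than anything in the language itself. A sketch (built with a flag such as `-mavx`):

```c
#include <immintrin.h>  /* compiler extension, not ISO C */

/* Standard C cannot express a 256-bit vector add directly; the AVX
   path goes through vendor intrinsics layered on top of the language. */
void add8(float *dst, const float *a, const float *b) {
    __m256 va = _mm256_loadu_ps(a);               /* unaligned 8-float load */
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(dst, _mm256_add_ps(va, vb)); /* 8 adds in one op */
}
```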
> C just minimally reflects how computers work. The rest is just convention.
This hasn't been true for decades. x86 assembly is now itself an abstraction over what the CPU is actually doing.
Microcode, speculative execution, etc.
It seems to be a meme on HN that C doesn't reflect hardware, now you're extending that to assembly. It seems silly to me. It was always an approximation of what happens under the hood, but I think the concepts of pointers, variable sizes and memory layout of structs all represent the machine at some level.
It's not a meme.
For example, C has pointer provenance, so pointers aren't just addresses. That's why type punning is such a mess. If a language claims to be super close to the hardware, this seems like a very weird thing.
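A concrete example of the mess: the classic float-bits pun. The cast violates strict aliasing (the compiler may assume the two pointers can't alias), while the `memcpy` version is the sanctioned spelling, and compilers typically turn it into a single register move anyway.

```c
#include <stdint.h>
#include <string.h>

uint32_t bits_bad(float f) {
    return *(uint32_t *)&f;    /* strict-aliasing violation: UB */
}

uint32_t bits_ok(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);  /* well-defined; compiles to a move */
    return u;
}
```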
> the concepts of pointers, variable sizes and memory layout of structs all represent the machine at some level.
Exactly.
Everything in assembly still maps one-to-one, in terms of functional/stateful behavior, to actual execution. Runtime hardware optimizations (decomposition of instructions into micro-ops and reordering, speculative branching, automated caching, etc.) give a performance boost but do not change the model. If they did, the machine wouldn't work!
And C is still very close to assembly in terms of basic operations, even if a compiler is able to map the same C operations to different instructions (e.g. scalar, SIMD, etc.).