Comment by munificent

6 hours ago

> why C gets to be the foundation for how systems software is written.

Is there an answer here more interesting than "it's what Unix and Windows were written in, so that's how programs talked to the OS, and once you have an interface, it's impossible to change"?

Yes and no. Clearly what you said is true, but the more profound reason is that C just minimally reflects how computers work. The rest is just convention.

  • More concretely, I think the magic lies in these two properties:

    1. Conservation of mass: the amount of C code you put in will be pretty close to the amount of machine code you get out. Aside from the preprocessor, which is very obviously expanding macros, there are almost no features of C that will take a small amount of code and expand it to a large amount of output. This makes some things annoyingly verbose to code in C (e.g. string manipulation), but that annoyance reflects a true fact of machine code, which is that it cannot handle strings very easily. (A small code sketch of both properties follows at the end of this comment.)

    2. Conservation of energy: the only work that will be performed is the work described by the code you put into your program. There is no "supervisor" performing work on the side (garbage collection, stack checking, context switching) on your behalf. From a practical perspective, this means that the machine code produced by a C compiler is standalone, and can be called from any runtime without needing a special environment to be set up. This is what makes C such a good language for implementing garbage collection, stack checking, context switching, etc.

    There are some exceptions to both of these principles. Auto-vectorizing compilers can produce large amounts of output from small amounts of input. Some C compilers do support stack checking (e.g. `-fstack-check`). Some implementations of C will perform garbage collection (e.g. Boehm, Fil-C). For dynamically linked executables, the PLT stubs will perform hash table lookups the first time you call a function. The point is that C makes it very possible to avoid all of these things, which has made it a great technology for programming close to the machine.

    Some languages excel at one but not the other. Byte-code oriented languages generally do well at (1): for example, Java .class files are usually pretty lean, as the byte-code semantics are pretty close to the Java language. Go is also pretty good at (1). Languages like C++ or Rust are generally good at (2), but have much larger binaries on average than C thanks to generics, exceptions/panics, and other features. C is one of the few languages I've seen that does both (1) and (2) well.
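
    To make (1) and (2) a bit more concrete, here is a rough sketch; the function and the build command are my own illustration, assuming a typical optimizing compiler with no special flags:

    ```c
    /* sum.c -- a minimal, self-contained function with no hidden runtime.
     * Built with something like `cc -O2 -c sum.c`, it becomes a short,
     * standalone loop that any host (a kernel, a GC, a JIT runtime) can
     * call directly, with no supervisor doing work on its behalf. */
    long sum(const long *xs, long n) {
        long total = 0;
        for (long i = 0; i < n; i++) {
            total += xs[i];  /* roughly one load and one add per element */
        }
        return total;
    }
    ```

    On most targets this compiles to a dozen or so instructions (more if the auto-vectorizer mentioned above kicks in), and it needs no garbage collector, stack probe, or runtime setup before it can be called.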

  • The things complained about in the article are not a minimal reflection of how computers work.

    Take the "wobbly types" for example. It would have been more "minimal" to have types tied directly to their sizes instead of having short, int, long, etc.

    There isn't any reason that compilers on the same platform have to disagree on the layout of the same basic type, but they do.

    The complaints about parsing header files could potentially be solved by an IDL that could compile to C header files and FFI definitions for other languages. It could even be a subset of C that is easier to parse. But nothing like that has ever caught on.
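
    For what it's worth, C99 did eventually bolt size-tied types on via `<stdint.h>`. A small sketch of the difference (the struct and its field names are made up for illustration):

    ```c
    /* Fixed-width types versus the "wobbly" built-in ones. */
    #include <stdint.h>
    #include <stdio.h>

    struct wire_header {
        uint32_t length;   /* exactly 32 bits on every conforming platform */
        uint16_t version;  /* exactly 16 bits */
        uint8_t  flags;    /* exactly 8 bits */
    };

    int main(void) {
        /* By contrast, `long` is 32 bits on 64-bit Windows and 64 bits on
         * 64-bit Linux/macOS: the wobbliness being complained about. */
        printf("sizeof(long) here:          %zu\n", sizeof(long));
        printf("sizeof(struct wire_header): %zu\n", sizeof(struct wire_header));
        return 0;
    }
    ```

    Even with fixed-width fields, though, the struct's overall size and padding are still up to the compiler and ABI, so this only addresses part of the complaint.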

  • It minimally reflects PDP-11 assembly, which is not how modern computers work.

    • This is a meme which is repeated often, but not really true. If you disagree, please state specifically what property of the PDP-11 you think is different from how modern computers work, and where this affects C but not other languages.

  • > C just minimally reflects how computers work. The rest is just convention.

    This hasn't been true for decades. x86 assembly is now itself an abstraction over what the CPU is actually doing.

    Microcode, speculative execution, etc.

      • It seems to be a meme on HN that C doesn't reflect hardware, and now you're extending that to assembly. It seems silly to me. It was always an approximation of what happens under the hood, but I think the concepts of pointers, variable sizes, and the memory layout of structs all represent the machine at some level.
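
        As a small, concrete illustration of that last point: field offsets in a struct are real, inspectable numbers, not something a runtime is free to rearrange. A sketch, assuming a typical 64-bit ABI (the struct itself is made up for illustration):

        ```c
        #include <stddef.h>
        #include <stdio.h>

        struct packet {
            char  tag;      /* offset 0 on typical ABIs */
            int   length;   /* usually offset 4, after 3 bytes of padding */
            void *payload;  /* usually offset 8 on 64-bit targets */
        };

        int main(void) {
            printf("tag     at offset %zu\n", offsetof(struct packet, tag));
            printf("length  at offset %zu\n", offsetof(struct packet, length));
            printf("payload at offset %zu\n", offsetof(struct packet, payload));
            printf("total size        %zu bytes\n", sizeof(struct packet));
            return 0;
        }
        ```

        The exact numbers are dictated by the platform ABI rather than by C itself, but the point stands: the language exposes the machine's view of memory instead of hiding it.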

It wasn't a coincidence, or an accident. C was specifically designed to write Unix, by people who had experience with a lot of other computer languages, and had programmed other operating systems including Multics and some earlier versions of Unix. They knew exactly what they were doing, and exactly what they wanted.

  • I'm not sure what you mean by "coincidence" or "accident" here.

    C is a pretty OK language for writing an OS in the 70s. UNIX got popular for reasons I think largely orthogonal to being written in C. UNIX was one of the first operating systems that was widely licensed to universities. Students were obliged to learn C to work with it.

    If the Macintosh OS had come out first and taken over the world, we'd probably all be programming in Object Pascal.

    When everyone wanted to program for the web, we all learned JavaScript regardless of its merits or lack thereof.

    I don't think there's much very interesting about C beyond the fact that it rode a platform's coattails to popularity. If there is something interesting about it that I'm missing, I'd definitely like to know.

    • It is often said that C became popular just because Unix was popular, due to being free -- it just "rode its coattails" as you put it.

      As if you could separate Unix from C. Without C there wouldn't have been any Unix to become popular, there wouldn't have been any coattails to ride.

      C gave Unix some advantages that other operating systems of the 1970s and 80s didn't have:

      Unix was ported to many different computers spanning a large range of cost and size, from microcomputers to mainframes.

      In Unix both the operating system and the applications were written in the same language.

      The original Unix and C developers wrote persuasive books that taught the C language and demonstrated how to do systems programming and application programming in C on Unix.

      Unix wasn't the first operating system to be written in a high-level language. The Burroughs OS was written in Algol, Multics was written in PL/I, and much of VMS was written in BLISS. None of those languages became popular.

      In the 1970s and 80s, Unix wasn't universal in universities. Other operating systems were also widely used: Tenex, TOPS-10, and TOPS-20 on DEC-10s and 20s, and VMS on VAXes. But their systems languages and programming cultures did not catch on in the same way as C and Unix.

      The original Macintosh OS of the 1980s was no competitor to Unix. It was a single user system without integrated network support. Apple replaced the original Macintosh OS with a system based on a Unix.

    • First to market is not necessarily the best; case in point: many video sites existed before YouTube, including ones based on Apple QuickTime. But in the end Flash won.

      To me it looks like when there is a better way to do things, the better one eventually wins.

    • > I'm not sure what you mean by "coincidence" or "accident" here.

      I mean Unix had to be written in C, not in, say, Algol or PL/I or BLISS, high-level languages used to write other operating systems.

      I also meant that the features of C were not put there by impulse or whim; they were the outcome of considered decisions guided by the specific needs of Unix.

    • Repeating a previous comment of mine (https://news.ycombinator.com/item?id=32784959) about an article in Byte Magazine (August 1983) on the C programming language:

      From page 52:

      > Operating systems have to deal with some very unusual objects and events: interrupts; memory maps; apparent locations in memory that really represent devices, hardware traps and faults; and I/O controllers. It is unlikely that even a low-level model can adequately support all of these notions or new ones that come along in the future. So a key idea in C is that the language model be flexible; with escape hatches to allow the programmer to do the right thing, even if the language designer didn't think of it first.

      This. This is the difference between C and Pascal. This is why C won and Pascal lost - because Pascal prohibited everything but what Wirth thought should be allowed, and Wirth had far too limited a vision of what people might need to do. Ritchie, in contrast, knew he wasn't smart enough to play that game, so he didn't try. As a result, in practice C was considerably more usable than Pascal. The closer you were to the metal, the greater C's advantage. And in those days, you were often pretty close to the metal...
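
      To make those "escape hatches" concrete, here is a sketch of the kind of thing the quote means by "apparent locations in memory that really represent devices". The base address and register layout here are hypothetical, purely for illustration:

      ```c
      #include <stdint.h>

      /* Hypothetical UART whose registers sit at a fixed physical address. */
      #define UART0_BASE ((uintptr_t)0x4000A000u)

      typedef struct {
          volatile uint32_t data;    /* write a byte here to transmit it */
          volatile uint32_t status;  /* bit 0: transmitter busy */
      } uart_regs;

      void uart_putc(char c) {
          /* Casting a bare integer to a pointer: the language lets you say it,
           * even though no C object has ever been allocated there. */
          uart_regs *uart = (uart_regs *)UART0_BASE;
          while (uart->status & 1u) {
              /* spin until the transmitter is free */
          }
          uart->data = (uint32_t)(unsigned char)c;
      }
      ```

      Standard Pascal simply had no sanctioned way to say this, which is why real-world Pascal compilers grew nonstandard extensions for exactly this kind of work.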

      Later, on page 60:

      > Much of the C model relies on the programmer always being right, so the task of the language is to make it easy to say what is necessary... The converse model, which is the basis of Pascal and Ada, is that the programmer is often wrong, so the language should make it hard to say anything incorrect... Finally, the large amount of freedom provided in the language means that you can make truly spectacular errors, far exceeding the relatively trivial difficulties you encounter misusing, say, BASIC.

      Also true. And the "Pascal model" of the programmer does have quite a bit of truth to it. But programmers collectively chose freedom over restrictions, even restrictions that were intended to be for their own good.