Comment by bayindirh

2 years ago

All processors are C VMs at the end of the day. They are designed for it, and it's a great language to access raw hardware and raw hardware performance.

I still fail to label C as evil.

P.S.: Don't start with all the memory management and related stuff. We have solutions for these everywhere, including, but not limited to, GCs, Rust, etc. Their existence does not invalidate C, and we don't need to abandon it. Horses for courses.

> All processors are C VMs at the end of the day.

That would have been a poor argument back in the 80s, and it is increasingly wrong for modern processors. Compiler intrinsics can paper over some of the conceptual gap, but dropping down to inline assembly can't be entirely eliminated (even if it's relegated to core libraries). Lots of C code relies on certain patterns compiling down to specific instructions, e.g. for vectorising, since C itself has no concept of such things. C is based around a 1D memory model which has no concept of cache hierarchies. C has no representation of branch prediction, out-of-order instructions, or pipelines, let alone hyperthreading or multi-core programming.

After all, if processors were "C VMs", then GCC/LLVM/etc. wouldn't be such herculean feats of engineering!
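
A minimal sketch of the vectorising point, assuming a plain scalar loop (the function name `saxpy` is only a conventional label for this kind of loop, not something from the thread). Nothing in the source mentions SIMD; whether vector instructions are emitted depends entirely on the compiler, its optimisation flags, and the target.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] += a * x[i] for i in [0, n).
 * As far as the C abstract machine is concerned, this is n scalar
 * multiply-adds executed one after another. `restrict` promises the
 * compiler that x and y do not overlap, which is often what makes
 * auto-vectorisation possible; it is a hint about aliasing, not a
 * request for any particular instruction. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++) {
        y[i] += a * x[i];
    }
}
```

Optimising compilers such as GCC and Clang will typically turn a loop like this into vector instructions at higher optimisation levels, but the language gives no way to require it, which is why performance-sensitive code ends up depending on the pattern being recognised or falls back to intrinsics.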

  • This is a subject I love to discuss.

    Exactly. C is based around 1D memory and has no understanding of caches. All of your other arguments are true, too.

    This is why caches, memory hierarchies and most other modern mechanisms are hidden from C (and from other languages, or software in general): to trick C into thinking it's still running on a PDP-11.

    All caches (L1, L2, L3, even disk caches and the various caches built in RAM) are handled by the hardware or by the OS kernel. Unless they provide an API to talk to, they are invisible, untouchable and unmanageable, and this is by design (especially the ones baked into hardware, like Lx and other buffers).

    Compilers are the interface perpetuating this smoke and mirrors, so as not to upset C's assumptions about the underlying machine. Even then, a compiler can only command the processor up to a certain point. You can't say "I want these in the cache, and evict those." These are automagic processes.

    Precisely for these reasons, CPUs are C VMs. They work completely differently from a PDP-11, but behave like one at the uppermost level, where compilers are the topmost layer in this toolchain.

    Compilers are such herculean feats of engineering because we need to trick the programs we're building into thinking they're running on much simpler hardware. In turn, hardware tries hard to keep the management overhead handled by compilers to a bare minimum while allowing higher and higher performance.

    More ponderings, and the foundation of my assertion, are here: https://dl.acm.org/doi/10.1145/3212477.3212479

    Paper is titled: C Is Not a Low-level Language: Your computer is not a fast PDP-11.
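
    The invisibility of caches can be seen from the C side with a small sketch (the function names here are made up for illustration). Both functions compute the same sum over the same flat array and are indistinguishable to the C abstract machine; only the traversal order differs.

    ```c
    #include <stddef.h>

    /* Sum a rows x cols matrix stored row by row in one flat array,
     * visiting elements in the order they sit in memory. */
    long sum_row_major(const int *m, size_t rows, size_t cols)
    {
        long total = 0;
        for (size_t r = 0; r < rows; r++)
            for (size_t c = 0; c < cols; c++)
                total += m[r * cols + c];
        return total;
    }

    /* Same sum, but walking down each column, so consecutive accesses
     * are cols elements apart in memory. */
    long sum_col_major(const int *m, size_t rows, size_t cols)
    {
        long total = 0;
        for (size_t c = 0; c < cols; c++)
            for (size_t r = 0; r < rows; r++)
                total += m[r * cols + c];
        return total;
    }
    ```

    On typical cached hardware the column-order walk tends to be slower for large matrices, because it strides across cache lines instead of using them sequentially. Nothing in the language expresses that difference; it only shows up when measured on a particular machine.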

    • Caches, memory hierarchies, out-of-order execution, etc. are hidden from assembly as well as C. One reason for this that isn't mentioned in your comment (or the ACM article) is not that everyone loves C, but that most software has to run on a variety of hardware, with differing cache sizes, power consumption targets, etc. Pushing all of that optimization and fine-tuning off to the hardware means that software isn't forced to work only on the exact computer model it was designed to run on.

      The author also mentions that alternative computation models would make parallel programming easier, but this neglects the numerous problems that aren't parallelizable. There's a reason why we haven't switched all of our computation to GPUs.

This is backwards. C was conceived as a way to do the things programmers were already doing in assembler, but with high(er)-level language conveniences. In turn, the things they were doing in assembler were done to efficiently use the "VM" their code was executed on.

  • I have linked a paper published in ACM Queue in another comment of mine, which discusses this in depth.

    The gist is that hardware and compilers hide all the complexity from C and other programming languages while trying to increase performance. IOW, they emulate a PDP-11 while not being a PDP-11.

    This is why C and its descendants are so long-lived and perform very well on these systems despite the traditional memory models and simple system models they employ.

    IOW, modern hardware and development tooling create an environment akin to a PDP-11, not unlike how VMs emulate other hardware to make other OSes happy.

    So, at the end of the day, processors are C VMs, anyway.