Comment by phicoh
2 years ago
C (and I use the term loosely, not in the strict sense of standard C) is very well suited for low-level programming of RISC processors. It's not clear that a new language would be a big benefit. The benefit of C was to abstract away from assembler and make Unix portable between architectures. C still works very well for that today.
The big question is whether we still want to write new production code in C. For one, the C standard is horrible: just about everything is undefined behavior. For another, you don't want to spend a lot of time checking code for array bounds errors, use after free, other pointer problems, variables or data structures that are not (properly) initialized, etc.
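As a minimal sketch of how silently these bugs slip in (function names are mine, purely illustrative): both versions below compile without complaint under default settings, but the first one reads freed memory.

```c
#include <stdlib.h>

/* Illustrative only: both functions compile cleanly, but the first
   is undefined behavior if ever called. (NULL checks on malloc are
   omitted for brevity.) */

int sum_after_free(void) {      /* never call this */
    int *p = malloc(sizeof *p);
    *p = 41;
    free(p);
    return *p + 1;              /* use after free: undefined behavior */
}

int sum_correct(void) {
    int *p = malloc(sizeof *p);
    *p = 41;
    int r = *p + 1;             /* read before freeing */
    free(p);
    return r;
}
```

The compiler accepts both without a warning; only a sanitizer, a static analyzer, or careful review catches the first.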
In addition, for large projects, all of the manual namespace management is also a bit of a bore.
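For illustration of what "manual namespace management" means in practice (all names here are invented): every public symbol gets the library name hand-prefixed onto it, because C has no namespaces.

```c
/* Hypothetical mini-library: the "mylib_" prefix IS the namespace,
   maintained by hand on every public type and function. */
struct mylib_counter { int n; };

void mylib_counter_init(struct mylib_counter *c)        { c->n = 0;  }
void mylib_counter_add(struct mylib_counter *c, int k)  { c->n += k; }
int  mylib_counter_get(const struct mylib_counter *c)   { return c->n; }
```

Forget the prefix once and you risk a link-time collision with some other library's `counter_init`.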
One big issue with C is that more or less, the CPU architecture has to treat memory as an array of bytes. There is just too much code that assumes that this is the case. Every data pointer can be converted to a void *, all function pointers are alike, etc. With Rust, there is a chance that radically new architectures are possible, if compatibility with C is no longer required.
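A sketch of the flat-bytes assumption described above (function name is mine): typical C leans on `void *` round-trips and on copying any object's representation byte by byte.

```c
#include <string.h>

/* Round-trip a double through void * and a raw byte buffer,
   relying on memory being a plain array of bytes. */
double roundtrip(double d) {
    void *erased = &d;               /* any data pointer -> void *      */
    double *back = erased;           /* ...and implicitly back again    */
    unsigned char bytes[sizeof d];
    memcpy(bytes, back, sizeof d);   /* object representation as bytes  */
    double out;
    memcpy(&out, bytes, sizeof d);   /* reassemble from the bytes       */
    return out;
}
```

An architecture where pointers carry capabilities, or where memory isn't byte-addressable, would struggle to honor exactly this kind of code.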
> C ... is very well suited for low level programming of RISC processors.
I wouldn't dispute that. You actually caught me expressing a raw and incomplete idea.
I'm imagining a totally fictitious set of events. Kernighan and Ritchie sitting in front of a PDP-11 and thinking to themselves: we'd really like a way to get this hardware to support multiple users simultaneously. I mean, the impetus of the language was this transitive dependency from hardware -> user need -> OS -> language to build that OS.
So I am looking at these new RISC-V SoCs that are coming out. I'm thinking about things like Tenstorrent and their chiplet designs, where you have a RISC-V CPU core surrounded by an ungodly number of GPU-like tensor cores. Nvidia's Grace Hopper chip designs are kinda-sorta similar, but with ARM cores.
We have a brand new kind of system architecture on our hands. And we have a new kind of challenge, which is the need to efficiently use that hardware. Instead of multi-user sharing of a single mini-computer, we have to find a way to saturate the GPU-like tensor cores surrounding a RISC CPU.
So we have a new chain with "new hardware" -> "new user need". So, maybe we do need a "new OS" that isn't POSIX-ish. And if that chain follows, then I wonder what "new language" would look like to complete that chain. Because I doubt that either C or Rust is the right one.
It seems we are grabbing tools that happen to be handy and applying them to a new situation, without doing the kind of first-principles, ground-up thinking that appears to have led to C. In other words, we're asking "what kind of OS can I build with this language?" instead of "what kind of language do I need to build this OS?"
If I remember correctly, Unix already supported multiple users before it was rewritten in C. It was the PDP-7 to PDP-11 transition that prompted the need to rewrite it in a higher-level language.
A key thing about Unix is that they took existing hardware and did something creative with that.
One thing that is hard about languages that support new hardware is that the core concepts of the new hardware architecture have to be stable enough that the new language can become popular. I don't know enough about GPU architectures, but it seems that they are not very stable and hidden behind abstractions provided by libraries. Which is not an ideal basis for a new language.
> new hardware architecture have to be stable enough that the new language can become popular
The reason my mind is on this topic is an interview with Jim Keller where he mentioned that when he designed modern CPUs, he conceptually thought of their front-end as a machine that processes C code (or rather, the kind of assembly a C compiler would generate). Internally the CPU does all kinds of black magic: provably-correct rewriting of that machine code into a form that performs well on modern microarchitectures (e.g. out-of-order execution, prefetching). So you end up with things like the JVM optimizing high-level languages while trying to make itself look like a C program, and the hardware providing a front-end to make itself look like a C-processing machine. A bit funny when you think about it. But it all works, since this implicit contract has been stable for decades.
But that contract doesn't exist for GPUs or other new kinds of hardware. A lot of modern silicon is "system-on-a-chip" kind of stuff, with multiple CPUs (some high-performance, some not), AI accelerators/neural engines, DSPs, media encoders/decoders, etc. That trend appears likely to continue.
Keller mentions that Nvidia got ahead of it with CUDA. They presented a C-like interface in front of their GPUs, threw insanely complex hardware behind it, and used that interface to hide the horror spread between the hardware and their drivers. But nowhere else does such a contract exist (and arguably it barely exists even for Nvidia, who seem to be just holding it together). Now that Keller is designing these SoCs, he is confronted with the new normal of no pre-defined contract between software and hardware.
It is a daunting task to face this new reality. But I really wish I saw some evidence of people tackling it from first principles. To sit down, look at the new architectures, and dream up new OS paradigms and the new languages that could support them. Because I strongly doubt that such a re-imagining would look anything like Rust.
> the C standard is horrible. Just about everything is undefined behavior
How is this horrible? This is a core defining feature of C. It's like saying that a sharp-edged object is terrible because you might cut yourself and bleed to death, therefore a dull equivalent is always better. That object is known as a knife. Knives with extremely sharp blades exist because grinding away at the thing you're trying to cut for hours isn't something any of us enjoys. Have you tried not hurting yourself before calling it terrible? Sure, someone careless may get cut badly, but I prefer using a real knife when cooking my food, and I will never change my mind about that.
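To make the "sharp edge" concrete with a hedged example (function name is mine): because signed overflow is undefined behavior, a compiler is allowed to assume it never happens and simplify arithmetic accordingly.

```c
/* Because signed overflow is undefined behavior, the compiler may
   assume x * 2 never overflows and fold this whole body down to
   "return x" -- no multiply, no divide. */
int half_of_double(int x) {
    return x * 2 / 2;
}
```

That assumption is exactly the sharpness being defended here: it buys real optimizations, at the price of cutting anyone whose values actually do overflow.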
> In addition, for large projects, all of the manual namespace management is also a bit of a bore.
When you're writing in C, fun is the least of your worries. I can recommend C++ for big projects; no one but yourself forces the use of templates and classes in that department. I can't speak for people writing for controllers made in 1984 with proprietary compilers that probably aren't even maintained anymore, but that's not my problem. For projects like Linux, C++ would be more than sufficient. It's ironic that Rust has been accepted into the Linux kernel, because what Torvalds said about C++ over a decade ago couldn't be more true of Rust as well, even though all you need to fix that is a bit of discipline, and discipline is practically 99.97% of C programming anyway.
> One big issue with C is that more or less, the CPU architecture has to treat memory as an array of bytes.
Not relevant: virtual memory is used in practically all modern hardware, and "virtual" here means you already see plain arrays with no clue what memory is truly like underneath. Even kernels run in this mode, and all kernel code is written with this expectation. Before dismissing the model as nonsense, you should tell us about a memory model that's superior to dumb arrays, because until then, no one has ever thought of a better one.
There is the classic case of the Linux kernel dereferencing a null pointer and then checking whether it is null. The compiler removed the null pointer check, because dereferencing a pointer implies that it cannot be null without invoking undefined behavior.
This is just horrible. C started out as a portable assembler. Now essential instructions get deleted. The C standard is a horrible way to define a low-level programming language. Obviously, C is a sharp object. But standard C is a sharp object that jumps at you.
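A minimal sketch of that kernel pattern (the struct and field names are hypothetical, not the actual kernel code):

```c
#include <stddef.h>

struct dev { int flag; };

int read_flag(struct dev *d) {
    int f = d->flag;   /* dereference happens first...              */
    if (d == NULL)     /* ...so the compiler may delete this check: */
        return -1;     /* a dereferenced pointer "cannot" be NULL   */
    return f;
}
```

With the check silently removed, a NULL `d` crashes (or, as in the original kernel bug, becomes exploitable) instead of returning -1 as the author intended.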
I had a lot of fun writing C. The same way you can have fun writing assembler or doing other low-level stuff. Compared to the rest of low-level kernel work, C itself is not a big problem.
I'm talking about radically different architectures. Virtual memory has a high cost in the implementation of a CPU. There could be other architectures: capability-based, segment-based, etc. But if your CPU has to run C efficiently, then you probably don't want to go there.
There is a lot of cruft, like ASLR, that only exists because memory is a flat array of bytes.
As long as the primary goal of a CPU is to run C efficiently, we won't get beyond paged virtual memory.