Comment by darksaints
3 days ago
I've been using Linux since 2005, and I've loved it in almost every circumstance. But the drama over the last couple of years surrounding Rust in the kernel has really soured me on it, and I'm now very pessimistic about its future. Beyond the emotional outbursts of various personalities, though, I don't think the problem is which side is "right". Both sides have extremely valid points. I don't think the problem is actually solvable, because managing a 40M+ SLoC codebase is barely tenable in general, and super duper untenable for something that we rely on for security while it runs in ring 0.
My best hope is for replacement. I think we've finally hit the ceiling of where monolithic kernels can take us. The Linux kernel will continue to make extremely slow progress while it deals with internal politics fighting against an architecture that can only get bigger and less secure over time.
But what could be the replacement? There's a handful of fairly mature microkernels out there, each with extremely immature userspaces. There doesn't seem to be any concerted efforts behind any of them. I have a lot of hope for SeL4, but progress there seems to be slow mostly because the security model has poor ergonomics. I'd love to see some sort of breakout here.
Like 75% of those lines of code are in drivers or architecture-specific code (code that only runs for x86 or ARM or SPARC or POWER etc.)
The amount of kernel code actually executing on any given machine at any given point in time is more likely to be around 9-12 million lines than anywhere near 40 million.
And a replacement kernel won't eliminate the need for hardware drivers for a very wide range of hardware. Again, that's where the line count ramps up.
Yes, of course. But apart from the (current) disadvantage that those drivers don't exist yet, those are all positives in favor of microkernel architectures. All of the massive SLOC codebases run in usermode and with full process isolation, require no specific language compatibility and can be written in any language, do not require upstreaming, and do not require extensive security evaluations from highly capable maintainers who have their focus scattered across 40m lines of code.
The amdgpu driver alone was over 5 million lines of code in 2023.
Most of these are header files. I suspect most of its contents are constants and blobs autogenerated with some tool by AMD.
Not a kernel guy, but - what's stopping a microkernel from emulating the Linux userspace? I know Microsoft had some success implementing the Linux ABI with WSL v1.0.
I suppose the main objection to that is accepting some degree of lock-in with the existing userspace (systemd, FHS...) over exploring new ideas for userspace at the same time.
FWIW Fuchsia has a not-quite-a-microkernel and has been building a Linux binary compatibility layer: https://fuchsia.dev/fuchsia-src/concepts/starnix?hl=en.
(disclaimer: I work on Fuchsia, on Starnix specifically)
EDIT: for extra HN karma and related to the topic of the posted email thread, Starnix (Fuchsia's Linux compat layer) is written in Rust. It does run on top of a kernel written in C++ but Zircon is much smaller than Linux.
Nice to hear Fuchsia is still being worked on. I was a bit concerned, given that no new changelogs had been published for half a year.
Fascinating area to work in! I've had a few curiosity things come to mind before:
What's the driving use case for Starnix? Well, obviously "run Linux apps on Fuchsia" like the RFC for it says... but "very specific apps as part of a specific use case which might be timeboxed" or "any app for the foreseeable future"?
How complete in app support do you currently consider it compared to something like WSL1?
What are your thoughts about why WSL2 went the opposite direction?
Thanks!
The Rust drama is completely overblown considering Rust is still years away from being a viable replacement. Sure, it makes sense to start experimenting and maybe write a few drivers in Rust, but many features are still only available in nightly Rust.
I suspect many rust devs tend to be on the younger side, while the old C guard sees Linux development in terms of decades. Change takes time.
Monolithic kernels are fine. The higher complexity and worse performance of a microkernel design are mostly not worth the theoretical architectural advantages.
If you wanted to get out of the current local optimum you would have to think outside of the unix design.
The main threat to Linux is the Linux Foundation, which is controlled by big-tech monopolists like Microsoft and spends only a small fraction of its budget on actual kernel development. It is embrace, extend, extinguish all over again, but people think Microsoft are the good guys now.
> but many features are still only available in nightly rust.
Nope. The features are all in stable releases (since last spring, in fact). However, some of the features are still marked as unstable/experimental and have to be opted into (so could, in theory, still have breaking changes). They're entirely features that are specific to kernel development and are only needed in the Rust bindings layer to provide safe abstractions in a kernel environment.
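For illustration, the opt-in described above is a crate-level attribute; `allocator_api` is one example of a gate the Rust-for-Linux bindings have used. This is a fragment only, not a buildable crate, and the exact feature list varies by kernel version.

```rust
// Crate-level opt-in to features that ship in stable compilers but are
// still marked unstable; the kernel build enables these explicitly.
#![feature(allocator_api)]
```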
> I have a lot of hope for SeL4, but progress there seems to be slow mostly because the security model has poor ergonomics.
seL4 has its place, but that place is not as a Linux replacement.
Modern general purpose computers (both their hardware, and their userspace ecosystems) have too much unverifiable complexity for a formally verified microkernel to be really worthwhile.
Oh don't worry, seL4 isn't formally proven on any multicore computer anyway.
And the seL4 core architecture is fundamentally "one single big lock" and won't scale at all to modern machines. The intended design is that each core runs its own kernel with no coordination (multikernel a la Barrelfish) -- none of which is implemented.
So as far as any computer with >4 cores is concerned, seL4 is not relevant at this time, and if you wish for that to happen your choice is really either funding the seL4 people or getting someone else to make a different microkernel (with hopefully a lot less CAmkES "all the world is C" mess).
Barrelfish! My dream project is developing a multikernel with seL4's focus on assurance. I want to go even further than seL4's minimalism, particularly with regard to the scheduler. I thiiiiink it doesn't have to be bad for performance. But I've not materialized anything, so I am just delusional. And yes, I am thinking of doing it in Rust. For all of Rust's shortcomings, especially for kernel development, I think it has a lot of promise. I also have the already-loves-Rust cognitive bias. I'm not trying to somehow achieve seL4's massive verification effort. (Will gasp AI facilitate it? Not likely.) I am sad that Barrelfish hasn't gotten more attention. We need more OS research.
I agree that SeL4 won't replace Linux anytime soon, but I beg to differ on the benefits of a microkernel, formally verified or not.
Any ordinary well-designed microkernel gives you a huge benefit: process isolation of core services and drivers. That means that even in the case of an insecure and unverified driver, you still have reasonable expectations of security. There was an analysis of Linux CVE's a while back and the vast majority of critical Linux CVEs to that date would either be eliminated or mitigated below critical level just by using a basic microkernel architecture (not even a verified microkernel). Only 4% would have remained critical.
https://microkerneldude.org/2018/08/23/microkernels-really-d...
The benefit of a verified microkernel like SeL4 is merely an incremental one over a basic microkernel like L4, capable of capturing that last 4% and further mitigating others. You get more reliable guarantees regarding process isolation, but architecturally it's not much different from L4. There's a little bit of clunkiness for writing userspace drivers for SeL4 that you wouldn't have for L4. That's what the LionsOS project is aiming to fix.
Process isolation of drivers is just not very useful when the driver is interfacing with a device that has full access to system memory. Which is the case for many devices today unless you use IOMMU to prevent this.
Your view is not espoused enough. Thank you for this comment. I'm not suggesting we just go and use seL4 myself, but it's a strong foundation that shows we don't have to be so cynical about the potential of microkernels.
I mean, why does it have to be formally verified? Seems to me the performance tradeoff for microkernels can be worth it to have drivers and other traditionally kernel-level code that don't bring down the system and can just be restarted in case of failure. Probably not something that will work for all hardware, but I would bet the majority would be fine with it.
At this point, even an unverified microkernel would be a huge step up in terms of security and reliability.
And the performance disadvantages of a microkernel are all overblown, if not outright false [1]. Sure, you have to make twice as many syscalls as with a monolithic kernel, but you can do so with much better caching behavior, due to the significantly smaller size. The SeL4 kernel is small enough to fit entirely in many modern processors' L2 cache. It's entirely possible (some chip designers have hinted as much) that with enough adoption they could prioritize having dedicated caches for the OS kernel...something that could never be possible with any monolithic kernel.
[1] https://trustworthy.systems/publications/theses_public/23/Pa...
> I mean why does it have to be formally verified.
Because we can and the security advantages are worth it.
> But what could be the replacement? There's a handful of fairly mature microkernels out there
Redox[0] has the advantage that no one will want to rewrite it in Rust.
[0]: https://redox-os.org/
GNU Mach! GNU Mach! GNU Mach! GNU Mach! GNU Mach! GNU Mach!