← Back to context

Comment by justmarc

1 year ago

I'm interested in these kind of kernels to run very high performance network/IO specific services on bare metal, with minimal system complexity/overheads and hopefully better (potential) stability and security.

The big concern I have however is hardware support, specifically networking hardware.

I think a very interesting approach would be to boot the machine with a FreeBSD or Linux kernel, just for the purposes of hardware as well as network support, and use a sort of Rust OS/abstraction layer for the rest, bypassing or simply not using the originally booted kernel for all user land specific stuff.

Couldn't you just boot the Linux kernel directly and launch a generic app as pid 1 instead of a full blown init system with a bunch of daemons?

That's basically what you're getting with Docker containers and a shared kernel. AWS Lambda is doing something similar with dedicated kernels with Firecracker VMs

  • Yes, you can. You can even have a different Pid 1 configure whatever and then replace it's core image with the new Pid 1.

  • Yes, but I wanted to bypass having the complexity of the Linux kernel completely, too.

    Basically single app directly to network (the world) and as little as possible else in between.

    • Linux kernel is not complex. Most of the code runs lock-free. For example, the slab allocator in the kernel uses only a single double_cmpxhg instruction to allocate an object via kmalloc(). The algorithm scales to any number of CPUs and has NUMA awareness. Basically, the most concurrent, lowest allocation latency allocator you can get in the market, which also returns the best objects for the requesting process on big memory systems.

      The complexity on the other hand is architectural and logical to achieve scale to hundreds of CPUs, maximise bandwidth and reduce latency as much as possible.

      Any normal Rust kernel will either have issues scaling on multi-cores or use tax-heavy synchronisation primitives. The kernel RCU and lock-free algorithm took a long time to be discovered and become mature and optimised aggressively to cater for the complex modern computer architectures of out-of-order execution, pipelining, complex memory hierarchies (especially when it comes to caching) and NUMA.

      2 replies →

If you want truly high-performance networking, you can bypass the kernel altogether with DPDK. So you don't have to worry about alternative kernels for other tasks at all. On the downside, DPDK takes over the NIC entirely, removing the kernel from the equation, so if you need the kernel to see network traffic for some reason, it won't work for you.

You can check out hardware support here: https://core.dpdk.org/supported/nics/

  • This was true a decade ago, with modern io_uring dpdk is probably an anti-pattern.

    • I'm not sure that's true for a good chunk of the workloads that dpdk really shines on.

      A lot of the benefit of dpdk is colocating your data and network stack in the same virtual memory context. io_uring I can see getting you there if you have you're serving fixed files as a cdn kind of like netflix's appliances, but for cases where you're actually doing branchy work on the individual requests, dpdk is probably a little easier to scale up to the faster network cards.

i might be wrong but if it's ABI compatible the same drivers will work?

p.s.: i was wrong

>While we prioritize compatibility, it is important to note that Asterinas does not, nor will it in the future, support the loading of Linux kernel modules.

https://asterinas.github.io/book/kernel/linux-compatibility....

  • Linux doesn't even maintain ABI compatibility with itself, nobody else is going to manage it. The possibility that might work is there's a couple projects that maintain just enough API compatibility to reuse driver code from Linux (IIRC FreeBSD does this for some graphics drivers). But even then you're gambling with whether Linux decides to change implementation details one day, since internal APIs explicitly aren't stable.

    • The Linux kernel community takes ABI compatibility for userland very seriously. That developers in userland are frequently unwilling to understand issues surrounding ABI stability is not the fault of the Linux kernel.

      5 replies →

  • They mention this in https://github.com/asterinas/asterinas/blob/2af9916de92f8ca1...

    > While we prioritize compatibility, it is important to note that Asterinas does not, nor will it in the future, support the loading of Linux kernel modules.

    • It's a lot "simpler" to support a Linux userland as that means one needs to "just" emulate all the Linux syscalls, than to implement the literally countless internal APIs needed for drivers etc, as that would otherwise mean literally reimplementing the whole Linux kernel and that's neither realistic, nor too useful.

      4 replies →

  • No, it means you can run Linux userland/apps on this kernel, to the level/depth which they currently support of course.

    They might not yet implement everything that's needed to boot a standard Linux userland but you could say boot straight into a web server built for Linux, instead of booting into init for example.

  • in general the ABI is kernel<->user space while the ABI (and potentially even API) on the inside (i.e. for drivers) can change with every kernel version (part of why it's so important to maintain drivers in-tree)