
Comment by derefr

6 years ago

You know, come to think of it, is there anything stopping Linux from having a... FKSE (Filesystem in Kernel SpacE) standard API?

Presumably, such a thing would just be a set of kernel APIs that would parallel the FUSE APIs, but would exist for (DKMS) kernel modules to use, rather than for userland processes to use. Due to the parallel, it would only be the work of a couple hours to port any existing FUSE server over into being such a kernel module.

And, given how much code could be shared with FUSE support, adding support for this wouldn't even require much of a patch.

Seems like an "obvious win", really.

It's not the context switch that kills you for the most part, but the nature of the API and its lack of direct access to the buffer cache and VMM layer. A stable FKSE API would run into the same issues.

That's why Microsoft moved WSL2 to a Linux kernel running on Hyper-V rather than implementing it inside the NT kernel. Their IFS (installable file system) driver stack screws up where the buffer cache manager sits, and it was pretty much impossible to change. At that point, the real apples-to-apples comparison left NT lacking. Running a full kernel in another VM ended up being faster because of this.

I mean, it doesn't really work that way: you can't just port a userspace program into a kernel module. For starters, there's no libc in the kernel - what do you do when you want to call `malloc`? ;)
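To make the `malloc` point concrete, here's a rough sketch of the kind of shim a port would need for every libc call it makes. The names `fs_alloc`, `fs_free`, and `fs_dup_name` are made up for illustration; only `kmalloc`, `kfree`, and `GFP_KERNEL` are real kernel names. The kernel branch is untested and glosses over the fact that `GFP_KERNEL` allocations can sleep, so they're not allowed in every context.

```c
#ifdef __KERNEL__
/* Kernel build: no libc, so use the slab allocator instead. */
#include <linux/slab.h>
#include <linux/string.h>
static inline void *fs_alloc(size_t n) { return kmalloc(n, GFP_KERNEL); }
static inline void fs_free(void *p) { kfree(p); }
#else
/* Userspace (FUSE-style) build: plain libc. */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
static inline void *fs_alloc(size_t n) { return malloc(n); }
static inline void fs_free(void *p) { free(p); }
#endif

/* Hypothetical helper shared by both builds: duplicate a file name.
 * Returns NULL if the allocation fails. */
static char *fs_dup_name(const char *name)
{
    size_t n = strlen(name) + 1;   /* include the trailing NUL */
    char *copy = fs_alloc(n);
    if (copy)
        memcpy(copy, name, n);
    return copy;
}
```

And that's just allocation. The same exercise has to be repeated for file I/O, threads, locking, and anything else the FUSE server pulls in from userspace libraries.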

That said, I doubt the performance issues are directly because it runs in userspace. They're more likely due to the marshaling/transferring from the in-kernel APIs into the FUSE API (and the complexity that comes with talking to userspace for something like a filesystem), as well as the fact that the FUSE program has to call back into the kernel via the syscall interface. Neither of those is easily fixable: FKSE would still effectively be using the FUSE APIs, and syscalls don't translate directly into callable kernel functions (and definitely not the ones you should be using).
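As a toy illustration of the marshaling overhead (not the real FUSE wire format), compare a direct call with the same operation pushed through a request buffer. Everything prefixed `demo_` is invented for this sketch, and the struct is a simplified stand-in for a request header. Real FUSE additionally moves the buffer across the user/kernel boundary through `/dev/fuse`, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified request header; NOT the actual FUSE layout. */
struct demo_req {
    uint32_t len;     /* total size of the request in bytes */
    uint32_t opcode;  /* which operation is being requested */
    uint64_t nodeid;  /* which inode the operation targets */
};

enum { DEMO_OP_GETSIZE = 1 };

/* What an in-kernel filesystem gets: one direct function call. */
static uint64_t demo_getsize_direct(uint64_t nodeid)
{
    return nodeid * 4096;  /* made-up answer, just for the demo */
}

/* "Kernel" side: pack the call into a buffer. Returns bytes written,
 * or 0 if the buffer is too small. */
static size_t demo_marshal(uint8_t *buf, size_t cap,
                           uint32_t opcode, uint64_t nodeid)
{
    struct demo_req r = { .len = sizeof r, .opcode = opcode, .nodeid = nodeid };
    if (cap < sizeof r)
        return 0;
    memcpy(buf, &r, sizeof r);
    return sizeof r;
}

/* "Daemon" side: unpack, validate, and dispatch. Returns 0 on success. */
static int demo_dispatch(const uint8_t *buf, size_t n, uint64_t *out)
{
    struct demo_req r;
    if (n < sizeof r)
        return -1;
    memcpy(&r, buf, sizeof r);
    if (r.len != sizeof r || r.opcode != DEMO_OP_GETSIZE)
        return -1;
    *out = demo_getsize_direct(r.nodeid);
    return 0;
}
```

Both paths produce the same answer, but the second one pays for a pack, a copy, an unpack, and a validation on every request. Moving the daemon into the kernel removes the boundary crossing but keeps that request/reply shape if the API stays FUSE-like.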

The hard part isn't the "FKSE API", the hard part is for the "FKSE driver" to be able to do anything other than talk to that API. Like, scheduling, talking to storage, the network, whatever is needed to actually implement a useful filesystem.

The problem is that nobody is interested in doing that and that's why we are in this situation in the first place. If Oracle wanted to integrate ZFS into Linux they would just relicense it.