Comment by cpgxiii

3 days ago

The design of WSL(1) makes more sense when you think of its original design goal of being a compatibility layer for Android apps. Android is "a Linux", but it is (1) a relatively unique one, and (2) everything between the Android kernel and Android apps really isolates the application layer from kernel-level details. Given this separation, it makes a lot of sense to leverage the existing NT flexibility and emulate a Linux kernel at the syscall layer. Sure, you'll have to patch some parts of the WSL(1) Android system components, but MS was already going to have to do that to work around Google-only components. In many ways, this route is no more complex than what Blackberry did to run Android apps atop their QNX-based OS.

But once you give up the specialization for Android and want WSL to be a "real Linux" (i.e. behave like a specific Ubuntu/Fedora/etc. distribution), you can no longer get away with being Linux-like; you have to be Linux (i.e. you need the syscall layer to directly mirror all kernel development and features). It's actually fairly impressive how much worked with WSL(1) given how different the internals were, but you didn't have to go that far to find tools/services/etc. that just wouldn't work.

Instead, once you consider how long MS had been working on Hyper-V, and how interested they are in using it to apply additional security boundaries/isolation (e.g. VBS) within what outwardly appears to be a single Windows OS instance to the user, it makes a lot of sense to leverage that same approach to just run a real Linux kernel atop Hyper-V. In that world, you no longer have to match Linux kernel development, you just need to develop/maintain the kernel drivers used to interact with Hyper-V - and MS already had a lot of experience and need to do that given how much of Azure is running Linux VMs.

Full VMs make much more sense indeed, because they allow you to run a fundamentally different OS while still keeping the host mostly the same. And they basically show that with the FreeBSD addition.

The syscall approach is just a form of emulation that you have to contain, and it becomes a pain to keep up to date. VMs use more resources, but at least they are disposable and only require a good virtualization layer on the host.

Funnily enough, with time, Microsoft might be able to run all the OSs inside their own OS. Of course, that won't happen for something like macOS but that would be hilarious.

I/O on many small files is dramatically faster in Linux on ext4 vs Windows thanks to NTFS' journaling overhead. So if you're doing development, you really want to do it inside WSL2.

  • I think it's more complicated than just NTFS's design.

    In my original comment I said that the difference is the Linux VFS for a reason. The slow part in NT is when you go from a filename to a handle. Doing things like caching lookups by name is, IIRC, the responsibility of the individual drivers. Linux does better at this by having a heavily optimized layer sitting between the filesystem driver and the caller. Doing tons of open(2)s is faster on Linux because of the overall kernel design.

  • Journaling isn't the issue: small files go into the $MFT, which is the fast path. The issue is the file system filter overhead.
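
If anyone wants to check the small-file claims above on their own machine, here is a rough sketch of a micro-benchmark. It uses only the Python standard library; the function name `time_small_file_io` and the default counts are made up for illustration. It times creating many small files and then reopening each one by name, which is the filename-to-handle path discussed above. It doesn't tell you which layer (journaling, filters, name lookup) is responsible, only how long the two phases take where you run it.

```python
import os
import tempfile
import time


def time_small_file_io(count=2000, size=64):
    """Create, then reopen and read, `count` small files.

    Returns (create_seconds, reopen_seconds). Hypothetical helper for
    illustration; the defaults are arbitrary.
    """
    payload = b"x" * size
    with tempfile.TemporaryDirectory() as root:
        paths = [os.path.join(root, f"f{i:05d}.txt") for i in range(count)]

        # Phase 1: create each file and write a small payload.
        start = time.perf_counter()
        for p in paths:
            with open(p, "wb") as fh:
                fh.write(payload)
        create_s = time.perf_counter() - start

        # Phase 2: reopen every file by name and read it back.
        start = time.perf_counter()
        total = 0
        for p in paths:
            with open(p, "rb") as fh:
                total += len(fh.read())
        reopen_s = time.perf_counter() - start

    # Sanity check that every byte written was read back.
    assert total == count * size
    return create_s, reopen_s


if __name__ == "__main__":
    create_s, reopen_s = time_small_file_io()
    print(f"create: {create_s:.3f}s  reopen+read: {reopen_s:.3f}s")
```

Run it once from native Windows Python and once inside WSL2 (on the Linux filesystem, not under /mnt/c) and compare. The temp directory location, antivirus settings, and caching will all affect the numbers, so treat them as indicative only.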