"I Contribute to the Windows Kernel. We Are Slower" (2013)

2 years ago (blog.zorinaq.com)

> Our low performance is not an existential threat to the business.

11 years later, nothing has changed.

Just to name one example of something I ran into last year: install the 'az' Azure CLI into your Docker image? Boom, 1.4 GB of extra space wasted! Why, you ask? Well, for one, every subcommand bundles its own Python runtime.
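
If you want to see where those gigabytes go, a rough sketch like the following can total it up from inside the image (the site-packages path is an assumption; adjust it for your base image):

    # Rough sketch: measure what a pip-installed azure-cli drags into an image.
    # The site-packages path is an assumption; adjust for your base image.
    from pathlib import Path

    site_packages = Path("/usr/local/lib/python3.11/site-packages")
    total = sum(f.stat().st_size for f in site_packages.rglob("*") if f.is_file())
    print(f"site-packages weighs in at {total / 1e9:.2f} GB")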

  • That is hardly valid criticism when Linux's answer for binary distribution is to bundle everything (containers).

    • "Linux" does not have a singular answer for binary distribution. Docker is one of many, with differing trade-offs.

    • Docker is horrible as a binary distribution method, and luckily it's rarely used as such. Usually it's apt, pip, etc. Docker is for deployment, not distribution.

  • welcome to venv, not just Windows

    • Python really needs to fix this problem. The language is so full of warts that it's almost akin to modern PHP.

      I'm hoping for a Python version that makes this zen item [1] the number one priority:

      > "There should be one-- and preferably only one --obvious way to do it."

      [1] https://peps.python.org/pep-0020/

      3 replies →

    • Controversial opinion: the insistence on the venv bullshit is the stupidest decision made in programming languages in the past twenty years, and it is entirely unnecessary.

      I've used Python for more than a decade on Arch Linux, across many machines at home and work. For essentially all of that time, I've been "sudo pip install"-ing to my heart's content. The number of times this has actually caused problems with my own Python scripts is less than the number of times I've had to help colleagues figure out venv bullshit in the past six months alone. The number of times that "sudo pip install" has caused breakage of anything except my own scripts is zero in ten years.

      AFAICT the Python core team has essentially no understanding of the level of sophistication and the actual pain points experienced by 95% of Python users. Python is the software equivalent of duct tape, and it is used accordingly. Putting the duct tape in a box that is hard to open and covered with warning labels is not a meaningful improvement.
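
      For reference, this is the per-project ritual being objected to; a minimal sketch using only the stdlib, with "requests" standing in for whatever package you actually need:

        # Minimal sketch of the per-project venv dance, stdlib only.
        # "requests" is just an illustrative package.
        import subprocess
        import venv

        venv.create(".venv", with_pip=True)  # create the isolated environment
        pip = ".venv/bin/pip"  # .venv\Scripts\pip.exe on Windows
        subprocess.run([pip, "install", "requests"], check=True)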

      3 replies →

This is not exclusive to Windows Kernel development. It is an increasingly common scenario in all kinds of development organizations, worsened by the anti-intellectual Scrum puritans and their simplistic understanding of business value.

This kind of spontaneous, incremental technical improvement usually doesn't even survive code review, because we have all become fanatics of simplicity and dumb code at all costs, not understanding that in some situations a more sophisticated approach is worth the cost.

  • I don't quite get how your second paragraph relates to the first one, could you please elaborate?

    I honestly would consider myself a "fanatic of simplicity and dumb code", but for the simple reason that source code should stay easy to hack at all times. I want to enable "this kind of spontaneous, incremental technical improvement", and I also want these hacks to pass code review (just add a "// HACK:" comment).

    But in regard to kernel development, I probably underestimate the inherent complexity of a codebase with 1B+ SLOC.

    • At the time I worked at a scrum shop, which was great at preventing some kinds of stupid waste of time, and IMO likely better overall than most of the anti-scrum brigade on HN would concede. It was also free of inspired work. Most inspired work is crap. Someone's pet feature isn't the customers' favourite feature… but scrum kills all of them, and that's too much.

      That company later set up hackathons: a time-boxed escape from scrum where you could write whatever you wanted, and if the team liked the look of the hacky prototype afterwards, it was adopted. Some things don't sound good before you write the code, or some programmers can't tell the right story before writing the code; I'm not sure which, and it doesn't matter anyway.

I haven't profiled our Windows kernel driver across different kernel versions (maybe I should!) but I'd like to offer this perspective: the kernel is incredibly stable from my driver development point of view. The biggest reason we ship different drivers for Windows 7/8/10 is just that newer WDKs don't support anything older than 10. The kernel has remained remarkably consistent while still offering new features we can take advantage of on non-legacy systems.

  • I don't know why "incredibly stable" is such a remarkable thing; the whole point of an operating system and a kernel is to offer a stable API to write your applications against.

    • >I don't know why "incredibly stable" is such a remarkable thing

      It's remarkable when you look at the landscape of Linux and Mac device drivers.

      Can you run out-of-tree drivers built for Linux 2.6 on 6.6? Can you install a device driver from 2007 on a modern macOS? Well, many Windows 7 drivers still work on 11. That's stability.

      3 replies →

    • I'm talking about the kernel space itself, not the APIs exposed to userland for interfacing with the kernel from your application. Internal APIs and behaviors have stayed mostly identical over the past ~20 years, and any changes usually land in a new export. I don't think this should be taken for granted.

My understanding of this Windows vs Linux OS development article is that it is a great description of why open source beats closed source for large, complex projects. Please correct me if I am wrong, instead of just downvoting me. I am here to learn.

  • Linux succeeded in large part because of Linus, not (just) because it is open source.

    Many open source projects languish, or get mired in petty bickering, same as any other large organisation of humans.

    The successful projects -- closed or open -- often have a strong-willed visionary with the political clout to enforce his way of doing things.

    Big corporations tend to eventually drive those visionaries away, resulting in a bland mess designed by committee with more paperwork written than actual code.

    > "That's literally the explanation for PowerShell. Many of us wanted to improve cmd.exe, but couldn't."

    However, this is the opposite of that effect, and the anonymous MS guy complaining about it in the article is dead wrong.

    Jeffrey Snover developed PowerShell; it's his unique vision, and in this respect he's very much like Linus Torvalds, Guido van Rossum, Larry Wall, or any other such famous developer you care to name.

    It's literally impossible to fix CMD.EXE, because meaningful changes to it would be breaking changes. That would destroy backwards compatibility, and is not something anyone responsible would do. Linus keeps repeating his mantra, "don't break userspace!", for a reason. It's not just Microsoft doing this kind of thing. Similarly, Bash and "sh" have barely changed over decades.

    PowerShell v1 and especially v2 were brilliant, elegant, and beautiful. Then Jeffrey Snover went on to do other things, PS got handed over to random Microsoft developers, and it slowly started to accumulate inconsistent warts and bugs.

    Jeffrey Snover works for Google now, which does circle back to a valid issue raised in the article: The FAANGs do keep poaching the best people from Microsoft, and Microsoft hasn't done much to fix this. Even from the outside, it's noticeable just how poor the average developer skill is at Microsoft.

    • Linux succeeded in large part because it was free, compared to the UNIXes of the time, which were also mired in licensing and lawsuit hell. But yes, Linus himself is a big part of Linux's success, and benevolent dictatorship is part of that.

      > That would destroy backwards compatibility

      People who weren't bitten by it rarely understand that.

      > PowerShell v1 and especially v2 were brilliant, elegant, and beautiful

      Oh yes! One thing I really liked is that an extremely big part of PS... was written in PS! You could literally drill down into some system cmdlets to see, learn, and adapt! That started to dwindle with PS5, sadly, due to performance requirements.

The same thing is happening to macOS, and has been happening for almost a decade now: ship half-baked, badly designed, mobile-first "features"; never fix bugs, regressions, or performance issues.

"Besides: you guys have systemd, which if I'm going to treat it the same way I treated NTFS, is an all-devouring octopus monster about crawl out of the sea and eat Tokyo and spit it out as a giant binary logfile."

This made me chuckle. They have a point there, too.

I was developing file system drivers on Windows in the era when this article was written. I've never seen such a convoluted mess of recursive locks. Concurrent trees with fine-grained locks always have the potential to get messy, but the number of ways in NTFS that you could end up taking a non-reentrant lock you were already holding (thus deadlocking) was mind-boggling.
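
For anyone who hasn't hit that failure mode: here is a toy illustration (in Python, whose threading.Lock is non-reentrant, like the locks described):

    # Toy illustration: re-acquiring a non-reentrant lock you already hold
    # deadlocks the thread forever (threading.RLock would not).
    import threading

    lock = threading.Lock()

    def self_deadlock():
        with lock:
            with lock:  # the thread waits on a lock it already holds
                print("never reached")

    # self_deadlock()  # uncommenting this hangs forever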

Obviously not the whole picture, but $MSFT is up about 1000% since this was written. Funny how things work out.

  • Not due to Windows.

    They successfully sell a lot of Linux on Azure. They forced users into Microsoft Teams (the worst piece of software…) and lured users into uploading their data to the cloud. Before, the lock-in was the operating system and applications; now their data is in the hands of others.

    Users? Most ignore their contribution to network effects (software scales) and therefore to vendor lock-in. Business customers often think short-term. The fees for Microsoft are high and the software has drawbacks. A migration (to Linux) pays off in the long term, but requires personnel and a strategy. And things which take time only pay off later, when the CEO/CTO isn't working there anymore.

    Linux is succeeding, also on the desktop. Red Hat and Canonical come pre-installed on some ThinkPads and Dells. But you need many devices to gain weight. Valve is doing it right: a device people want (the Steam Deck), pre-installed with Linux, with an ecosystem they want to use (Steam).

This needs a 2013 tag, but it's even more true today than it was back then. W11 feels like the absolute pinnacle of unoptimized code, more interested in shiny new untested features than in giving us better performance.

  • I think we are talking about the kernel, not about the OS at large.

    • Not much has changed. The problems are pervasive and workarounds rule the day. MS has become more accepting of them, though.

      For example they can't make NTFS fast, and they can't seem to make ReFS (their post-NTFS effort) backwards compatible, so now they're shipping it as "Dev Drive". A Dev Drive is the same thing as a normal drive, but not as slow, because they concluded that only developers care about filesystem performance:

      https://learn.microsoft.com/en-us/windows/dev-drive/

      But userspace also has a lot of weird performance problems, many introduced by their sandboxing efforts. For example, their FS broker was extremely slow for a long time, and maybe still is.

This problem is everywhere, not just at Microsoft. I had to update our backup software, Veeam Backup & Replication, to the latest version, and the installation ISO is 9 GB in size! 9 GB for _backup software_.

I note some people in the comments section were quite offended by the "9 to 5 with kids types" remark.

  • I find it funny that they try to paint people not devoting their full time/energy to a capitalist company as lesser. Joke's on them: the company is there to exploit you; you aren't part of some greater cause. The more of your time you give the company, the more value they get.

    • While I can understand some people getting bent out of shape over that kind of comment, I know what he means.

      I'm a grey-haired .NET dev (pushing 50 now!) and for me that comment means there is no desire among new devs to learn beyond what they need for the day job!

      As an example: I learned programming in the 90s at a time when computer resources had to be managed - You couldn't just throw more servers at it since they cost so much money and you had to host them on-prem at the time.

      So I learned the OS inside out. I knew how Windows worked to quite a low level - not Mark Russinovich levels, but way, way more than my colleagues did - and all my code, even today, is written with an eye on performance!

      Back then, when I learned SQL, I spent lots of time tweaking SQL to get it to run faster - I mean the tables, indexes and such, as well as the SQL queries that I had to fix (I wasn't coding then but I had to fix much of the $hit that third parties produced).

      Fast forward to now and you are lucky if 20% of the devs at my company understand SQL properly. They can write basic SQL statements, but that's it! They all learned to program using ORMs and can't troubleshoot slow queries. I'm not joking! I had to fix a system that used EF, and one of the queries that EF generated caused the DB to try to sort through 14B rows of data when we don't even have that much data in the DB! Our largest database had almost no indexes on it, and they kept adding more CPUs to get it to run faster... it didn't! I've since fixed it, and we've dropped the CPU count quite a bit too.
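
      The fix is usually small once you actually look at the query plan. A toy sketch of the before/after, with SQLite standing in for the real DB and the schema made up:

        # Toy sketch: the same query goes from a full scan to an index search.
        # SQLite stands in for the production DB; the schema is made up.
        import sqlite3

        con = sqlite3.connect(":memory:")
        con.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")

        def plan(sql):
            return con.execute("EXPLAIN QUERY PLAN " + sql).fetchall()

        q = "SELECT * FROM orders WHERE customer_id = 42"
        print(plan(q))  # SCAN orders -> full table scan
        con.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
        print(plan(q))  # SEARCH orders USING INDEX ix_orders_customer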

      I think Scott Hanselman talked about this a few years ago. The analogy he used was the kitchen taps in his house: they broke, and his wife's understanding of how the plumbing worked stopped at the point where the pipe went into the wall, so she had no idea what could be wrong! His argument (if I remember correctly) was that if you just try to understand what happens after the pipe goes into the wall, you'll open up a whole new world of understanding.

      So I agree with the sentiment: we live in a world of abstraction, new devs are coming on board without that desire to know what happens under the covers, and it'll bite us in the ass when guys like me retire.

The Windows Kernel is slower because it does more stuff, and it guarantees that your code will still work for a very long time after a feature is released. People are reluctant to make big changes to things like NTFS or Named Pipes, because they literally have 30+ years of software that must remain functional - this is one of The Biggest Value Propositions of Windows: when you run an app, or your LOB software, or anything else that your business Needs, it Fucking Works, full stop.

The anonymous poster who wrote this has a very junior perspective on software development; it reminds me of some of the conversations I heard among interns in the Windows org.

  • > The Windows Kernel is slower because it does more stuff, and it guarantees that your code will still work for a very long time after a feature is released.

    I have literally loaded Debian Woody (circa 2002) onto a modern 64-bit Linux kernel from over 20 years later, and it just works.

    As for "more stuff", I'm not sure what you are referring to but I suspect it's difficult to compare. Linux's networking, hardware support, debugging and introspection facilities, files systems (off the top of my head) have always been way ahead of what Windows offers. I suspect that until DRM, Windows GUI / GPU was a long way been ahead of Linux particularly after they virtualised the GPU (in Vista?). But perhaps Linux has caught up now by slicing the cake differently.

> That's literally the explanation for PowerShell. Many of us wanted to improve cmd.exe, but couldn't.

I stopped reading here. (Not really, I read the rest, but I could have.)

PowerShell is a fundamentally new thing that's miles ahead of its competitors. They couldn't have gotten there by just improving cmd incrementally. If the author confuses boldness with being carefree about maintenance, that's on them.

  • I'm not a user or fan of PowerShell, but let me tell you something funny... the creator of PowerShell did it as a side project at Microsoft and got DEMOTED for doing it. That's about all you need to know about the culture of Microsoft in the Steve Ballmer years.

  • PowerShell was a fundamentally new thing 15 years ago; years pass, and I still use zsh and have no desire or motivation to use PowerShell.

    It is no longer hyped, but it also never got a killer app, so it is stuck in the "exists" phase.

    • Until I started working at a SaaS company shipping to Windows enterprise customers, I thought PowerShell wasn't used by anyone. Now I see it all the time. It's not fantastic, but if you're in the Windows world it beats writing CMD scripts.

      As an end user, though, I imagine most people use bash or some other unix-world shell, especially post-WSL. The "Git Bash" distribution is surprisingly useful as an everyday Windows shell.

      3 replies →

  • The only problem with PowerShell (for me) is that it came too late. If it had been released with .NET 2.0 or 3.0, maybe I would have used it more.

    When it appeared, I was already using bash (from the Git distribution) and Python for scripting tasks.

  • tbf, in 2013 PowerShell was a lot less powerful, and certainly didn't have the institutional support it does today.