Comment by mort96

3 years ago

Or, GNU could just recognise their extremely central position in the GNU/Linux ecosystem and just not. break. everything. all. the. time.

It honestly really shouldn't be this hard, but GNU seems to have an intense aversion towards stability. Maybe moving to LLVM's replacements will be the long-term solution. GNU is certainly positioning itself to become more and more irrelevant with time, seemingly intentionally.

The issue is more subtle than that. The GNU and glibc people believe that they provide a very high level of backwards compatibility. They don't have an aversion to stability and, in fact, go far beyond most libraries by e.g. providing old versions of symbols.

The issue here is actually that app compatibility is something that's hard to do purely via theory. The GNU guys do compatibility on a per-function level by looking at a change and saying "this is a technical ABI break so we will version a symbol". This is not what it takes to keep apps working. What it actually takes is what the commercial OS vendors do (or used to do): have large libraries of important apps that they drive through a mix of automated and manual testing to discover quickly when they broke something. And then if they broke important apps they roll the change back or find a workaround regardless of whether it's an incompatible change in theory or not, because it is in practice.
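
For what it's worth, this is roughly what that per-function versioning looks like from the library side. A minimal sketch using the generic GNU symbol-versioning mechanism rather than glibc's own internal macros; the FOO_* version nodes and the foo functions are invented for illustration:

    /* libfoo.c: when foo()'s behaviour changes incompatibly, the new code is
       exported under a new version node while the old code stays available
       under the old one, so already-linked binaries keep the semantics they
       were built against.
       Build: gcc -shared -fPIC -Wl,--version-script=foo.map -o libfoo.so libfoo.c
       with foo.map containing:
           FOO_1.0 { global: foo; local: *; };
           FOO_2.0 { global: foo; } FOO_1.0;                                  */

    /* Old behaviour, kept alive for binaries that linked against FOO_1.0. */
    int foo_old(int x) { return x + 1; }
    __asm__(".symver foo_old, foo@FOO_1.0");

    /* New behaviour, marked as the default ("@@") that new links pick up. */
    int foo_new(int x) { return x * 2; }
    __asm__(".symver foo_new, foo@@FOO_2.0");

Existing binaries keep resolving foo to foo@FOO_1.0 forever, which is far more than most libraries bother with, but it only helps when someone notices that a change needs this treatment in the first place.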

Linux is really hurt here by the total lack of any unit testing or UI scripting standards. It'd be very hard to mass test software on the scale needed to find regressions. And, the Linux/GNU world never had a commercial "customer is always right" culture on this topic. As can be seen from the threads, the typical response to being told an app broke is to blame the app developers, rather than fix the problem. Actual users don't count for much. It's probably inevitable in any system that isn't driven by a profit motive.

  • I think part of the problem is that by default you build against the newest version of symbols available on your system. So it's real easy when you're working with code to commit yourself to some symbols you may not even need; there's nothing like Microsoft's "target a specific version of the runtime" (the closest manual workaround is sketched below this sub-thread).

    • I really, really miss such a feature with glibc. There are so many times when I just want to copy a simple binary from one system to another and it won't work simply because of symbol versioning and because the target has a slightly older glibc. Just using Ubuntu LTS on a server and the interim releases on a development machine is a huge PITA.

      1 reply →
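
    For reference, the closest manual approximation of "target a specific version of the runtime" on glibc is a per-symbol version pin, which is essentially what the apbuild tool mentioned further down automates. A sketch, assuming an x86-64 target where memcpy is exported as both memcpy@GLIBC_2.2.5 and memcpy@GLIBC_2.14:

        #include <stdio.h>
        #include <string.h>

        /* Bind this file's memcpy reference to the old GLIBC_2.2.5 version
           instead of memcpy@GLIBC_2.14, so the binary also loads on distros
           whose glibc predates 2.14. The version numbers are the x86-64
           ones; other architectures differ. */
        __asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

        int main(int argc, char **argv) {
            (void)argc;
            char dst[256];
            size_t n = strlen(argv[0]) + 1;  /* runtime size, so the call isn't inlined away */
            memcpy(dst, argv[0], n < sizeof dst ? n : sizeof dst);
            dst[sizeof dst - 1] = '\0';
            puts(dst);
            return 0;
        }

    It works, but you have to repeat it for every versioned symbol you pull in, which is why most people end up just building inside a chroot or container of the oldest distro they still care about.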

  • > What it actually takes is what the commercial OS vendors do (or used to do): have large libraries of important apps that they drive through a mix of automated and manual testing to discover quickly when they broke something.

    There are already sophisticated binary analysis tools for detecting ABI breakages (a sketch of the kind of change they flag follows below), not to mention extensive guidelines.

    > And, the Linux/GNU world never had a commercial "customer is always right" culture on this topic.

    Vendors like Red Hat are extremely attentive to their customers. But if you're not paying, then you only deserve whatever attention they choose to give you.
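
    To make "technical ABI break" concrete, the sketch below shows the classic kind of change those tools (e.g. abidiff from libabigail) are built to flag. The struct and field names are made up, and the _v1/_v2 suffixes exist only so it compiles as one file; in reality both would just be struct widget in successive releases:

        #include <stddef.h>

        /* widget.h as shipped in release 1.0 */
        struct widget_v1 {
            int  id;
            char name[16];   /* starts at offset 4 */
        };

        /* widget.h as shipped in release 2.0: a field was inserted in the
           middle. Source code still compiles unchanged, but a binary built
           against the 1.0 layout reads name[] four bytes too early and hands
           out structs that are too small for the new code. */
        struct widget_v2 {
            int  id;
            int  flags;
            char name[16];   /* now starts at offset 8 */
        };

        /* The moved offset is exactly what an ABI checker reports. */
        _Static_assert(offsetof(struct widget_v1, name) !=
                       offsetof(struct widget_v2, name),
                       "layout changed: this is an ABI break");

        int main(void) { return 0; }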

  • > As can be seen from the threads, the typical response to being told an app broke is to blame the app developers, rather than fix the problem.

    This is false. Actual problems get fixed, and very quickly at that.

    Normally the issues are from proprietary applications that were buggy to begin with, and never bothered to read the documentation. I'd say to a paying customer that if a behaviour is documented, it's their problem.

    • > Normally the issues are from proprietary applications that were buggy to begin with, and never bothered to read the documentation. I'd say to a paying customer that if a behaviour is documented, it's their problem.

      … But that's exactly why Win32 was great; Microsoft actually spent effort making sure their OS was compatible with broken applications. Or at least, the Microsoft of long past did; supposedly they worked around a use-after-free bug in SimCity for Windows 3.x when they shipped Windows 95. Windows still has infrastructure to apply application-specific hacks (the Application Compatibility Database).

      I have no reason to believe their newer stacks have anything like this.

    • The issue I see most often is that someone compiled the application on a slightly newer version of Linux, and when they try to run it on a slightly older machine it barfs, saying that it needs GLIBC_2.31 and the system libc only has up to GLIBC_2.28 or something like that. Even if you aren't using anything that changed in the newer versions, it will refuse to run (a minimal example is sketched below).

      12 replies →
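
      For reference, that failure mode is easy to reproduce with a program that uses nothing new at all; the exact versions depend on the build machine, but __libc_start_main@GLIBC_2.34 is one common recent culprit:

          /* hello.c: calls nothing that changed in recent glibc releases.
             Build it on a new distro (say glibc 2.35) and copy it to an older
             one (say glibc 2.28): the dynamic loader still refuses to start
             it, because the compiler's startup code referenced a newer
             versioned symbol (often __libc_start_main@GLIBC_2.34). The check
             happens at load time, before main() ever runs. */
          #include <stdio.h>

          int main(void) {
              puts("hello");
              return 0;
          }

      Running objdump -T on the binary and grepping for GLIBC_ shows exactly which versions it demands, which is usually the quickest way to find out what dragged the requirement in.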

  • Well, you talk about Windows; that was true in the pre-Windows 8 era. Have you used Windows recently?

    I bought a new laptop, and decided to give Windows a second chance. With Windows 11 installed, there were a ton of things that didn't work. To me it was not acceptable for a $3,000 laptop. Problems with drivers, blue screens of death, applications that just didn't run properly (and commonly used applications, not something obscure). I never had these problems with Linux.

    I mean, we say Windows is stable mostly because we use Windows versions after they've been out for five years and most of the problems have been fixed. Right now companies are finishing the transition to Windows 10, not Windows 11, after staying on Windows 7 for years. In ten years they will probably move to Windows 11, once most of its bugs have been fixed.

    If you use a rolling-release Linux distro, such as Arch Linux, some problems with new software are expected. It's the equivalent of using an Insider build of Windows, with the difference that Arch Linux is mostly usable as a daily OS (it requires some knowledge to solve the problems that inevitably arise, but I used it for years). If you use, let's say, Ubuntu LTS, you don't have these kinds of problems, and it mostly runs without any issues (fewer issues than Windows, for sure).

    By the way, maintaining compatibility has a cost: have you ever wondered why a full installation of Ubuntu (a complete system with all the programs you use, an office suite, drivers for all the hardware, multimedia players, etc.) is less than 5 GB, while a fresh install of Windows is at least 30 GB, and I think nowadays even more?

    > And then if they broke important apps they roll the change back or find a workaround regardless of whether it's an incompatible change in theory or not, because it is in practice.

    I never saw Microsoft do that: they will simply say that it's not compatible and the software vendor has to update. That is not a problem, by the way... an OS developer should move along and can't maintain backward compatibility forever.

    > The GNU and glibc people believe that they provide a very high level of backwards compatibility.

    That is true. It's mostly backward compatible; 100% backward compatibility is not possible. Problems are fixed as they are detected.

    > What it actually takes is what the commercial OS vendors do (or used to do): have large libraries of important apps that they drive through a mix of automated and manual testing to discover quickly when they broke something.

    There is one issue: GNU can't test non-free software, for obvious licensing and policy reasons (i.e. an association that endorses free software can't buy licenses for proprietary software just to test it). So a third party would have to test it and report problems when backward compatibility breaks.

    Keep in mind that binary compatibility is not fundamental on Linux, since it's assumed that you have the source code of everything and can recompile the software if needed. GNU/Linux was born as a FOSS operating system and was never designed to run proprietary software. There are edge cases where you need to run a binary anyway (you lost the source code, or compiling it is complicated or takes a lot of time), but they are edge cases, and not a lot of time should be spent addressing them.

    Besides that, glibc is only one of the possible libcs you can use on Linux: if you are developing proprietary software, in my opinion you should use musl libc; it has an MIT license (so you can statically link it into your proprietary binary) and it's 100% POSIX compliant. Surely glibc has more features, but your software probably doesn't use them.

    Another viable option is to distribute your software with one of the new packaging formats that are in reality containers: Snap, Flatpak, AppImage. That allows you to distribute the software along with all its dependencies and not worry about ABI incompatibility.

    • I literally run Windows Insider builds on two of my laptops: the primary one is on the beta channel and the auxiliary one is on the alpha channel. Both are running Windows 11 and had 10 running before. The auxiliary one has lived on Insider builds for, I think, 5 years if not 6 and has definitely had issues, like Intel wifi stopping working and some other minor ones, but the main one has had, I guess, 3-4 BSODs over 2 years and failed to wake from sleep around 10 times. That's pretty much all of the issues.

      To me that's impressive, and I can't complain about stability.

    • I believe that AppImage still has the glibc compatibility issues. I've read through AppImage creation guides, which suggest compiling on the oldest distro possible, since glibc is forward compatible but not backward.

    • You cut out a key word:

      > Linux is really hurt here by the total lack of any unit testing or UI scripting standards.

      > standards

      I've been very impressed reading how the Rust developers handle this. They have a tool called crater[1], which runs regression tests for the compiler against all Rust code ever released on crates.io or GitHub. Every front-facing change that is even slightly risky must pass a crater run.

      https://github.com/rust-lang/crater

      Surely Microsoft has internal tools for Windows that do the same thing: run a battery of tests across popular apps and make sure changes in the OS don't break any user apps.

      Where's the similar test harness for Linux you can run that tests hundreds of popular apps across Wayland/X11 and Gnome/KDE/XFCE and makes sure everything still works?

      10 replies →

    • Right, but Linux (the OS) doesn't have unit tests to ensure that changes to the underlying system don't break the software on top. Imagine if MS released a new version of Windows and tons of applications stopped functioning. Everyone would blame MS. The Linux community does it all the time and just says that it's the price of progress.

      4 replies →

    • Well, let's see. What do I know about this topic?

      I've used Linux since the Slackware days. I also spent years working on Wine, including professionally at CodeWeavers. My name can still be found all over the source code:

      https://gitlab.winehq.org/search?search=mike%20hearn&nav_sou...

      and I'm listed as an author of the Wine developers guide:

      https://wiki.winehq.org/Wine_Developer%27s_Guide

      Some of the things I worked on were the times when the kernel made ABI changes that broke Wine, like here, where I worked with Linus to resolve a breakage introduced by an ABI-incompatible change to the ptrace syscall:

      https://lore.kernel.org/all/1101161953.13273.7.camel@littleg...

      I also did lots of work on cross-distribution binary compatibility for Linux apps, for example by developing the apbuild tool which made it easy to "cross compile" Linux binaries in ways that significantly increased their binary portability by controlling glibc symbol versions and linker flags:

      https://github.com/DeaDBeeF-Player/apbuild/blob/master/Chang...

      So I think I know more than my fair share about the guts of how Win32 and Linux work, especially around compatibility. Now, if you had finished reading to the end of the sentence you'd see that I said:

      "Linux is really hurt here by the total lack of any unit testing or UI scripting standards"

      ... unit testing or UI scripting standards. Of course Linux apps often have unit tests. But to drive real-world apps through a standard set of user interactions, you really need UI-level tests and tools that make UI scripting easy. Windows has tons of these, like AutoHotkey, but there is (or was, it's been some years since I looked) a lack of this sort of thing for Linux due to the proliferation of toolkits. Some support accessibility APIs but others are custom and don't.

      It's not the biggest problem. The cultural issues are more important. My point is that the reason Win32 is so stable is that for the longest time Microsoft took the perspective that it wouldn't blame app developers for changes in the OS, even when theoretically it could. They also built huge libraries of apps they'd purchased and used armies of manual testers (+automated tests) to ensure those apps still seemed to work on new OS versions. The Wine developers took a similar perspective: they wouldn't refuse to run an app that does buggy or unreasonable things, because the goal is to run all Windows software and not try to teach developers lessons or make beautiful code.

      2 replies →

GNU / glibc is _hardly_ the problem regarding ABI stability. TFA is about a library trying to parse executable files, so it's kind of a corner case; hardly representative.

The problem when you try to run a binary from the 90s on Linux is not glibc. Think e.g. of one of the Loki games, like SimCity. The audio will not work (and this will be a kernel ABI problem...). The graphics will not work. There will be no desktop integration whatsoever.

  • > Think e.g. of one of the Loki games, like SimCity. The audio will not work (and this will be a kernel ABI problem...). The graphics will not work. There will be no desktop integration whatsoever.

    I have it running on an up-to-date system. There is definitely an issue in that it's a pain to get working, especially for people not familiar with the CLI or ldd and such, as it wants a few things that are not there by default. But once you get it the few libs it needs, and ossp to emulate the missing OSS support in the kernel, there is no issue with gameplay, graphics or audio, aside from the intro video, which doesn't play.

    So I guess the issue is that the compatibility is not user friendly? Not sure how that should be fixed, though. Even if Loki had shipped all the needed libs with the program, it would still be an issue not to have sound, due to distros making the choice not to build OSS anymore.

    • It would seem from your example that the issue is a lack of overall commitment to compatibility. There are Windows games from the 1990s that still run fine with sound, which is not surprising, given that every old Win32 API related to sound is still there, emulated as needed on top of the newer APIs. It sounds like Linux distros could do this here as well, since the emulation is already implemented; they just choose not to have it set up out of the box.

    • > So I guess the issue is that the compatibility is not user friendly?

      I don't understand this point: this is like claiming Linux has perfect ABI compatibility because at the end of the day you can run your software under a VM or a container. Of course everything has perfect compatibility if you go out of your way to use old installations or emulation layers; people on Windows actually install Wine's DX9 libraries since they have better compatibility than the native MS ones. But this means zilch for Windows' ABI compatibility record (or lack thereof).