Comment by AshamedCaptain
13 days ago
What I find amazing is why people continously claim glibc is the problem here. I have a commercial software binary from 1996 that _still works_ to this day. It even links with X11, and works under Xwayland.
The trick? It's not statically linked, but dynamically linked. And it doesn't like with anything other than glibc, X11 ... and bdb.
At this point I think people just do not know how binary compatibility works at all. Or they refer to a different problem that I am not familiar with.
We (small HPC system) just upgraded our OS from RHEL 7 to RHEL 9. Most user apps are dynamically linked, too.
You don't want to believe how many old binaries broke. Lot of ABI upgrades like libpng, ncurses, heck even stuff like readline and libtiff all changed just enough for linker errors to occur.
Ironically all the statically compiled stuff was fine. Some small things like you mention only linking to glibc and X11 was fine too. Funnily enough grabbing some old .so files from the RHEL 7 install and dumping them into LD_LIBRARY_PATH also worked better than expected.
But yeah, now that I'm writing this out, glibc was never the problem in terms of forwards compatibility. Now running stuff compiled on modern Ubuntu or RHEL 10 on the older OS, now that's a whole different story...
> Funnily enough grabbing some old .so files from the RHEL 7 install and dumping them into LD_LIBRARY_PATH also worked better than expected.
Why "better than expected"? I can run the entire userspace from Debian Etch on a kernel built two days ago... some kernel settings need to be changed (because of the old glibc! but it's not glibc's fault: it's the kernel who broke things), but it works.
> Now running stuff compiled on modern Ubuntu or RHEL 10 on the older OS, now that's a whole different story...
But this is a different problem, and no one makes promises here (not the kernel, not musl). So all the talk of statically linking with musl to get such type of compatibility is bullshit (at some point, you're going to hit a syscall/instruction/whatever that the newer musl does that the older kernel/hardware does not support).
Better than expected as it's mixing userlands. We didn't put the entire /usr/lib of the old system in LD_LIBRARY_PATH but just some stuff like old libpng, libjpeg and the shebang. Taking an image of an old compute node still on RHEL 7 and then dumping it a container naturally worked, but at that point it's only the kernel interface you have to worry about, not different glibc, gtk, qt and that kind of stuff.
> it's the kernel who broke things
I remember this in a heated LKML exchange, 13 years ago, look how the table has turned:
>> Are you saying that pulseaudio is entering on some weird loop if the
> returned value is not -EINVAL? That seems a bug at pulseaudio.
Mauro, SHUT THE FUCK UP!
It's a bug alright - in the kernel. How long have you been a maintainer? And you still haven't learnt the first rule of kernel maintenance?
If a change results in user programs breaking, it's a bug in the kernel. We never EVER blame the user programs. How hard can this be to understand?
To make matters worse, commit f0ed2ce840b3 is clearly total and utter CRAP even if it didn't break applications. ENOENT is not a valid error return from an ioctl. Never has been, never will be. ENOENT means "No such file and directory", and is for path operations. ioctl's are done on files that have already been opened, there's no way in hell that ENOENT would ever be valid.
> So, on a first glance, this doesn't sound like a regression,
> but, instead, it looks tha pulseaudio/tumbleweed has some serious
> bugs and/or regressions.
Shut up, Mauro. And I don't _ever_ want to hear that kind of obvious garbage and idiocy from a kernel maintainer again. Seriously.
I'd wait for Rafael's patch to go through you, but I have another error report in my mailbox of all KDE media applications being broken by v3.8-rc1, and I bet it's the same kernel bug. And you've shown yourself to not be competent in this issue, so I'll apply it directly and immediately myself.
WE DO NOT BREAK USERSPACE!
Seriously. How hard is this rule to understand? We particularly don't break user space with TOTAL CRAP. I'm angry, because your whole email was so _horribly_ wrong, and the patch that broke things was so obviously crap. The whole patch is incredibly broken shit. It adds an insane error code (ENOENT), and then because it's so insane, it adds a few places to fix it up ("ret == -ENOENT ? -EINVAL : ret").
The fact that you then try to make excuses for breaking user space, and blaming some external program that used to work, is just shameful. It's not how we work.
Fix your f*cking "compliance tool", because it is obviously broken. And fix your approach to kernel programming.
The problem of modern libc (newer than ~2004, I have no idea what that 1996 one is doing) isn't that old software stops working. It's that you can't compile software on your up to date desktop and have it run on your "security updates only" server. Or your clients "couple of years out of date" computers.
And that doesn't require using newer functionality.
But this is not "backwards compatibility". No one promises this type of "forward compatibility" that you are asking for . Even win32 only does it exceptionally... maybe today you can still build a win10 binary with a win11 toolchain, but you cannot build a win98 binary with it for sure.
And this has nothing to do with 1996, or 2004 glibc at all. In fact, glibc makes this otherwise impossible task actually possible: you can force to link with older symbols, but that solves only a fraction of the problem of what you're trying to achieve. Statically linking / musl does not solve this either. At some point musl is going to use a newer syscall, or any other newer feature, and you're broke again.
Also, what is so hard about building your software in your "security updates only" server? Or a chroot of it at least ? As I was saying below, I have a Debian 2006-ish chroot for this purpose....
> maybe today you can still build a win10 binary with a win11 toolchain, but you cannot build a win98 binary with it for sure.
In my experience, that's not quite accurate. I'm working on a GUI program that targets Windows NT 4.0, built using a Win11 toolchain. With a few tweaks here and there, it works flawlessly. Microsoft goes to great lengths to keep system DLLs and the CRT forward- and backward-compatible. It's even possible to get libc++ working: https://building.enlyze.com/posts/targeting-25-years-of-wind...
1 reply →
Windows dlls are forward compatible in that sense. If you use the Linux kernel directly, it is forward compatible in that sense. And, of course, there is no issue at all with statically linked code.
The problem is with the Linux dynamic linking, and the idea that you must not statically link the glibc code. And you can circumvent it by freezing your glibc abstraction interface, so that if you need to add another function, you do so by making another library entirely. But I don't know if musl does that.
5 replies →
This kind of compatibility is available on, of all systems, macOS.
> The trick? It's not statically linked, but dynamically linked. And it doesn't like with anything other than glibc, X11 ... and bdb.
How would that work given that glibc has gone through a soname change since then? If it's from 1996 are you sure the secret isn't that it uses non-g libc?
It has libc5 and glibc versions. It even has a version shipped as an rpm, which I guess makes it from 97. The rpm, by the way, also installs.
> It has libc5 and glibc versions
That suggests someone went to significantly more effort than "just dynamically link it".
3 replies →
can you write up a blog of how this is working? because both as a publisher and a user, broken binaries are much more the norm
Broken binaries because of glibc? you'd need to put an example, cause my point is that I'm yet to see any.
If you are talking about _any_ other library, yes, that is a problem. My point is that glibc is the only one who even has a compatibility story.
got it thanks for clarifying.