
Comment by zetafunction

3 months ago

Disclaimer: I work on Chrome and I have contributed a (very) small number of fixes to libxml2/libxslt for some of the recent security bugs.

Speaking from personal experience, working on libxslt is... not easy, for many reasons beyond the complexity of XSLT itself. For instance:

- libxslt is linked against by all sorts of random apps and changes to libxslt (and libxml2) must not break ABI compatibility. This often constrains the shape of possible patches, and makes it that much harder to write systemic fixes.

- libxslt reaches into libxml and reuses fields in creative ways, e.g. libxml2's `xmlDoc` has a `compression` field that is ostensibly for storing the zlib compression level [1], but libxslt has co-opted it for a completely different purpose [2].

- There's a lot of missing institutional knowledge and no clear place to go for answers, e.g. what does a compile-time flag that guards "refactored parts of libxslt" [3] do exactly?

[1] https://gitlab.gnome.org/GNOME/libxml2/-/blob/ca10c7d7b513f3...

[2] https://gitlab.gnome.org/GNOME/libxslt/-/blob/841a1805a9a9aa...

[3] https://gitlab.gnome.org/GNOME/libxslt/-/blob/841a1805a9a9aa...
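To make the ABI point in the first bullet concrete, here is a minimal sketch of why a public struct constrains the shape of a patch. The structs `doc_v1` and `doc_v2` and the function `old_caller_reads_compression` are hypothetical names invented for illustration; they are not libxml2's actual `xmlDoc` layout. The sketch assumes an application compiled against the old header is handed an object created by a newer library build that inserted a field in the middle of the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical v1 of a public struct, as shipped in a library header. */
struct doc_v1 {
    int type;
    int compression;
    char *name;
};

/* Hypothetical v2: a "fix" that inserts a new field before `compression`. */
struct doc_v2 {
    int type;
    int flags;        /* new field, shifts everything after it */
    int compression;
    char *name;
};

/*
 * Simulates an application that was compiled against the v1 header.
 * The offset of `compression` was baked into its machine code, so it
 * keeps reading from the v1 offset even when the object it receives
 * was laid out by a v2 library.
 */
int old_caller_reads_compression(const void *doc)
{
    int value;

    assert(doc != NULL);
    memcpy(&value,
           (const char *)doc + offsetof(struct doc_v1, compression),
           sizeof value);
    return value;
}
```

Given a `struct doc_v2` with `flags = 7` and `compression = 9`, the old caller gets back the value stored in `flags` rather than `compression` on typical platforms, without any compile or link error. That is the kind of silent breakage that rules out many otherwise straightforward fixes to a struct that is visible in a public header.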

Sounds like libxslt needs more than just a small number of fixes, and it sounds like Google could be paying someone, like you, to help provide the necessary guidance and feedback to increase the usability and capabilities of the library and evolve it for the better.

Instead, Google and others just use it, and expect any issues that come up to be fixed immediately by the one or two open source maintainers who happen to work on it in their spare time. The power imbalance must not be lost on you here...

If you wanted to dive into what [3] does, you could do so, you could then document it, or refactor it so that it is more obvious, or remove the compile time flag entirely. There is institutional knowledge everywhere...

  • Or the downstream users who use it and benefit directly from it could step up, but websites and their users are extremely good at expecting things to just magically keep working, especially if they don't pay for it. It was free, so it should be free forever, and someone set it up many moons ago, so it should magically keep working for many more!

    // of course we know that, as end-users became the product, Big Tech [sic?] started making sure that users remain dumb.

> libxslt is linked against by all sorts of random apps and changes to libxslt (and libxml2) must not break ABI compatibility. This often constrains the shape of possible patches, and makes it that much harder to write systemic fixes.

I’m having trouble expressing this in a way that won’t likely sound harsher than I really want, but, uh, yes? That’s the fundamental difference between maintaining a part of the commons that anybody can benefit from and a subdirectory in a monorepo. The bazaar incurs coordination costs, and not being able to go and fix all the callers is one of them.

(As best as I can see, Chrome’s approach is largely to make everything a part of the monorepo, so maintaining a part of the commons may not be high on the list of priorities.)

This is not to defend any particular ABI choice. Too often ABI is left to luck and essentially just happens instead of being deliberately designed, and too often in those cases we get unlucky. (I’m tempted to recite an old quote[1] about file formats, which are only a bit more sticky than public ABI, because of how well it communicates the amount of seriousness the subject ought to evoke: “Do you, Programmer, take this Object to be part of the persistent state of your application, to have and to hold, through maintenance and iterations, for past and future versions, as long as the application shall live?”)

I’m not even deliberately singling out what seems to me like the weakest of the examples in your list. It’s just that ABI, to me, is such a fundamental part of lib-anything that raising it as an objection against fixing libxslt or libxml2 specifically feels utterly bizarre.

[1] http://erights.org/data/serial/jhu-paper/upgrade.html

  • It's one thing if the library was proactively written with ABI compatibility in mind. It's another thing entirely if the library happens to expose all its implementation details in the headers, making it that much harder to change things.

    • When i first encountered the early GNOME 1 software back in the very late 1990s, and DV (libxml author) was active, i was very surprised when i asked for the public API for a library and was told, look at the header files and the source.

      They simply didn’t seem to have a concept of data hiding and encapsulation, or worse, felt it led to evil nasty proprietary hidden code and were better than that.

      They were all really nice people, mind you—i met quite a few of them, still know some—and the GNOME project has grown up a lot, but i think that’s where libxml was coming from. Daniel didn’t really expect it to be quite so widely used, though, i’m sure.

      I’ve actually considered stepping up to maintain libxslt, but i don’t know enough about building on Windows and don’t have access to non-Linux systems really. Remote access will only go so far on Windows i think, although it’d be OK on Mac.

      It might be better to move to one of the Rust XML stacks that are under active development (one more active than the other).

    • No, it's the same in both cases. ABI stability is what every library should provide no matter how ugly the ABI is.
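For readers unfamiliar with the distinction drawn in this subthread between a library written with ABI compatibility in mind and one that exposes its implementation details in the headers, here is a minimal sketch of the opaque-handle pattern in C. Everything in it (`doc`, `doc_new`, `doc_get_compression`, and so on) is a hypothetical API invented for illustration, not something libxml2 or libxslt provides.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- What a public header would expose (hypothetical API) ---- */

/* Incomplete type: callers can hold a pointer but cannot see fields. */
typedef struct doc doc;

doc *doc_new(void);
void doc_free(doc *d);
int  doc_get_compression(const doc *d);
void doc_set_compression(doc *d, int level);

/* ---- Private to the library's .c file ---- */

/*
 * Because callers never see this definition, fields can be added,
 * removed, or reordered between releases without changing the ABI.
 */
struct doc {
    int internal_flags;
    int compression;
};

doc *doc_new(void)
{
    /* calloc zero-initializes every field; returns NULL on failure. */
    return calloc(1, sizeof(doc));
}

void doc_free(doc *d)
{
    free(d);
}

int doc_get_compression(const doc *d)
{
    assert(d != NULL);
    return d->compression;
}

void doc_set_compression(doc *d, int level)
{
    assert(d != NULL);
    d->compression = level;
}
```

With this shape, callers depend only on function symbols and their signatures, so the struct layout stops being part of the ABI. The trade-off is that retrofitting it onto a library whose structs are already public is itself an ABI break, which is the bind described above.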