Comment by Mikhail_Edoshin

9 hours ago

XSLT is not bad, but XML, unfortunately, is normally misused, so XSLT is tainted as it has to be a part of that misuse.

The true role of XML is grammar-based notations. These occur in two places: when a human gives data to a machine and when a machine produces data for a human. This is where XML is used despite its often-mentioned shortcomings; for example, many notations to describe the user interface are based on XML. This is convenient, because user interfaces are created manually. (I am not mentioning text markup; it is well known.)

Yet XML was often used as a notation for machine-to-machine exchange, for example in the ONIX book description standard. Here data are moved between two computers, yet for some reason they have to form grammatically correct phrases according to a set of grammar rules. Computers do not need grammar. They do just fine with non-grammatical data, like a set of tables. It is way simpler for them; parsing or generating grammar, even explicit grammar, is pure overhead for data exchange and is only necessary when data enters or leaves the computed pipeline.

So, to your examples: configuration in XML is actually fine, but IPC is not. Configuration is written by hand, IPC happens between machines. IPC specification, on the other hand, is also a good fit for XML.

That said, XML, and thus XSLT, has another flaw: it is way too verbose and there is no good way to format it. Conciseness was an explicit non-goal, but now we can say that was a mistake.

I thought Tim Bray's XML spec was one of the most beautiful tech documents I'd ever seen when I saw it for the first time. Adding namespaces at that point in history, though, was a disaster. Back then developers just weren't used to that kind of rigor. (When I first started coding Java I had to go to a website run by frickin' NASA to get a clear explanation of how namespaces worked.)

It didn't help that Microsoft dropped a stick of over-complicated standards that tried to bring RPC into XML. RPC has always been a cursed concept: between (1) trying to be intellectually coherent and (2) caring about performance, RPC systems become incomprehensible, and it doesn't matter whether it is Sun RPC, DCOM, CORBA, "Web Services", Protocol Buffers, etc.

The fact that the "REST economy" is intellectually incoherent and couldn't care less about performance seems to have helped it succeed. Just now I wrote a JavaScript function that looks like

   const get_item = async (item_id) => {...}

and it does

   GET /item/{item_id}

and I have a Java function on the server that looks like

   Item getItem(String item_id)

and is tagged with some annotations that make it get called when that GET request arrives. Jackson lets me write an Item as an "anemic domain object" that gets turned into the exact JSON I want, and the only real complaint I have is that the primitive types are anemic, so representing dates is a hassle.
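To make the date complaint concrete, here is a minimal sketch of what happens on the JavaScript side. JSON has no date type, so a Date goes out as an ISO 8601 string and comes back as a plain string unless the client converts it. The `created_at` field and the `reviveDates` helper are made up for illustration; they are not from the code described above.

```javascript
// Hypothetical item; `created_at` is an assumed field name, not from the real Item class.
const original = { item_id: "42", created_at: new Date("2024-01-02T03:04:05.000Z") };

// JSON.stringify calls Date.prototype.toJSON, which emits an ISO 8601 string.
const wire = JSON.stringify(original);

// A plain JSON.parse has no way to know the string was a date.
const naive = JSON.parse(wire);

// Hypothetical reviver: convert known date fields back by name.
const reviveDates = (key, value) =>
  key === "created_at" ? new Date(value) : value;

const revived = JSON.parse(wire, reviveDates);

console.log(typeof naive.created_at);            // the date arrived as a string
console.log(revived.created_at instanceof Date); // the reviver restored a Date
```

The hassle is that the list of date fields has to be maintained by hand on every client, because nothing in the JSON itself marks a string as a date.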

The XML abuse I've seen at work is truly horrifying. We use protobuf for most of our inter-service IPC, but for one particular team one of their customers demands the use of XML so that it can be run through some XSLT "security" filters, so they have to transform a fairly large protobuf object into XML, run it through said filters, and then convert it back to protobuf :( I weep every time I think about it.

  • It is probably impossible to find a tech stack that has not seen horrible abuse somewhere. :D

    Granted, it did seem that XML got more heavily abused than some other options for a while. I am curious if that is just a byproduct of when it was introduced. That, or just the general proliferation of front-end developers. (I hate that I am pushing that to almost be a complaint. I certainly don't mean it that way.)