Comment by nolok

13 hours ago

It's generally speaking part of the problem with the entire "XML as a savior" mindset of that earlier era and a big reason why we left them, doesn't matter if XSLT or SOAP or even XHTML in a way ... Those were defined as machine language meant for machine talking to machine, and invariably something goes south and it's not really made for us to intervene in the middle; it can be done but it's way more work than it should be, especially since they clearly never based it on the idea that those machines will sometimes speak "wrong", or a different "dialect".

It looks great, then you design your stuff and it goes great, then you deploy to the real world and everything catches on fire instantly, and every time you stop one, another one starts.
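
A small illustration of the "speaks wrong" case, as a rough sketch using Python's standard `xml.etree.ElementTree`. The payloads and the `try_parse` helper are made up for the example. The point it shows: a single unclosed tag is a fatal error for an XML parser, so there is no partial result left to work with.

```python
import xml.etree.ElementTree as ET

# Hypothetical payloads: one well-formed, one missing the </price> close tag.
GOOD = "<order><item>widget</item><price>9.99</price></order>"
BAD = "<order><item>widget</item><price>9.99</order>"


def try_parse(payload: str):
    """Return (root, None) on success or (None, error message) on failure."""
    try:
        return ET.fromstring(payload), None
    except ET.ParseError as exc:
        # Well-formedness errors are fatal: nothing of the document survives.
        return None, str(exc)


good_root, good_err = try_parse(GOOD)
bad_root, bad_err = try_parse(BAD)

print(good_root.findtext("item"))
print(bad_root, bad_err)
```

Recovering anything from the second payload means stepping outside the parser entirely, which is the "way more work than it should be" part.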

> It's generally speaking part of the problem with the entire "XML as a savior" mindset of that earlier era and a big reason why we left them

Generally speaking I feel like this is true for a lot of stuff in programming circles, XML included.

New technology appears, some people play around with it. Others come up with using it for something else. Give it some time, and eventually people start putting it everywhere. Soon "X is not for Y" blogposts appear, and usage finally starts to decrease as people rediscover "use the right tool for the right problem". Wait yet some more time, and a new technology appears, and the same cycle begins again.

Seen it with so many things by now that I think "we'll" (the software community) forever be stuck in this cycle and the only way to win is to explicitly jump out of the cycle and watch it from afar, pick up the pieces that actually make sense to continue using and ignore the rest.

  • A controversial opinion, but JSON is that too. Not as bad as XML was (̶t̶h̶e̶r̶e̶'̶s̶ ̶n̶o̶ ̶"̶J̶S̶L̶T̶"̶)̶, but wasting cycles to manifest structured data in an unstructured textual format has massive overhead on the source and destination sides. It only took off because "JavaScript everywhere" was taking off — performance be damned. Protobufs and other binary formats already existed, but JSON was appealing because it's easily inspectable (it's plaintext) and easy to use — `JSON.stringify` and `JSON.parse` were already there.

    We eventually said, "what if we made databases based on JSON" and then came MongoDB. Worse performance than a relational database, but who cares! It's JSON! People have mostly moved away from document databases, but that's because they realized it was a bad idea for the majority of use cases.

    • Both XML and JSON were poor replacements for s-expressions. Combined with Lisp and Lisp macros, a more powerful data manipulation text format and language has never been created.

    • Yup, agree with everything you said!

      I think the only part left out is people believing in the currently hyped way, "because this time it's right!" or whatever they claim. Kind of the way TypeScript people always appear when you say that TypeScript is currently one of those hyped things and will eventually be overshadowed by something else, just like the other languages before it; then, sure enough, someone will share why TypeScript happens to be different.

    • The fact that you bring up protobufs as the primary replacement for JSON speaks volumes. It's like you're worried about a problem that only exists in your own head.

      >wasting cycles to manifest structured data in an unstructured textual format

      JSON IS a structured textual format you doofus. What you're complaining about is that the message defines its own schema.

      >has massive overhead on the source and destination sides

      The people that care about the overhead use MessagePack or CBOR instead.

      I personally hope that I will never have to touch anything based on protobufs in my entire life. Protobuf is a garbage format that fails at the basics. You need the schema one way or another, so why isn't there a way to negotiate the schema at runtime in protobuf? Easily half or more of the questionable design decisions in protobuf would go away if the client retrieved the schema at runtime. The compiler-based workflow in protobuf doesn't buy you a significant amount of performance in the average JS or JVM based webserver, since you're copying from a JS object or POJO to a native protobuf message anyway. It's inviting an absurd amount of pain for little to no benefit. What I'm seeing here is a motte-and-bailey justification for making the world a worse place. The motte is the argument that text-based formats are computationally wasteful, which is easily defended. The bailey is the implicit argument that hard-coding the schema the way protobuf does is the only way to implement a binary format.

      Note that I'm not arguing particularly in favor of MessagePack here or even against protobuf as it exists on the wire. If anything, I'm arguing the opposite. You could have the benefits of JSON and protobuf in one. A solution so good that it makes everything else obsolete.

  • There have been many such cycles, but the XML hysteria of the 00s is the worst I can think of. It lasted a long time and the square peg XML was shoved into so many round holes.

    • IDK, the XML hysteria is comparable to the dynamic and functional language hysterias. And it pales in comparison to the microservices, SPA and current AI hysterias.

Now we have "JSON as savior". I see it way too often where new people come into a project and the first thing they want to do is to replace all XML with JSON, just because. Never mind that this solves basically nothing and often introduces its own set of problems. I am not a big fan of XML but to me it's pretty low in the hierarchy of design problems.
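
To make "its own set of problems" concrete, here is a rough sketch with Python's standard `json` module. The `record` dict is invented for the example, and the ISO-string workaround is just one common convention, not anything JSON itself defines. It shows three things: dates have no native encoding, some types change shape on a round trip, and comments are a parse error.

```python
import json
from datetime import date

# Hypothetical record with a date, an int-keyed mapping and a tuple.
record = {"shipped": date(2024, 1, 31), "counts": {1: "one"}, "pair": (1, 2)}

# 1. No date type: the standard encoder refuses date objects outright.
try:
    json.dumps(record)
    date_error = None
except TypeError as exc:
    date_error = str(exc)

# 2. Workaround (an assumption, not a standard): encode dates as ISO strings.
#    The round trip then yields a plain string, string keys and a list.
encoded = json.dumps(record, default=lambda obj: obj.isoformat())
decoded = json.loads(encoded)

# 3. No comments: an annotated config snippet does not parse.
try:
    json.loads('{"retries": 3 /* bump if flaky */}')
    comment_error = None
except json.JSONDecodeError as exc:
    comment_error = str(exc)

print(decoded)
```

None of this is fatal, but a straight XML-to-JSON port trades one set of conventions you have to agree on for another.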

  • The only problem with XML is the verbosity of the markup. Otherwise it's a nice way to structure data without the bizarre idiosyncrasies of YAML or JSON.

    • XML has its own set of idiosyncrasies like everything being a string. Or no explicit markup of arrays. The whole confusion around attributes vs values. And many others.

      JSON has its own set of problems like lack of comments and for some reason no date type.

      But in the end they are just data file formats. We have bigger things to worry about.

    • I mean, XML has its own bizarre idiosyncrasies like the whole attribute vs child element distinction (which maps nicely to text markup but less so for object graphs).

      I would say that the main benefit of XML is that it has a very mature ecosystem around it that JSON is still very much catching up with.

> part of the problem with the entire "XML as a savior" mindset of that earlier era

I think part of the problem is focusing on the wrong aspect. In the case of XSLT, I'd argue its most important properties are being pure, declarative, and extensible. Those can have knock-on effects, like enabling parallel processing, untrusted input, static analysis, etc. The fact it's written in XML is less important.

Its biggest competitor is JS, which might have nicer syntax but it loses those core features of being pure and declarative (we can implement pure/declarative things inside JS if we like, but requiring a JS interpreter at all is bad news for parallelism, security, static analysis, etc.).

When fashions change (e.g. XML giving way to JS, and JSON), we can end up throwing out good ideas (like a standard way to declare pure data transformations).

(Of course, there's another layer to this, since XML itself was a more fashionable alternative to S-expressions; and XSLT is sort of like Lisp macros. Everything old is new again...)
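
As a loose analogy only (this is not XSLT, and `RULES` and `transform` are names made up for the sketch), the "pure, declarative" idea can be imitated in a few lines of Python: a table of rules keyed by tag name, each a pure function of the element and its already-rendered children.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: tag name -> pure function of (element, rendered children).
# Text is not HTML-escaped here; this is a sketch, not a safe renderer.
RULES = {
    "doc": lambda el, kids: "<html><body>" + "".join(kids) + "</body></html>",
    "title": lambda el, kids: "<h1>" + (el.text or "") + "</h1>",
    "para": lambda el, kids: "<p>" + (el.text or "") + "</p>",
}


def transform(el: ET.Element) -> str:
    """Render an element by looking up its rule; no shared state is mutated."""
    kids = [transform(child) for child in el]
    # Unknown tags fall back to passing their rendered children through.
    rule = RULES.get(el.tag, lambda el, kids: "".join(kids))
    return rule(el, kids)


source = ET.fromstring("<doc><title>Hi</title><para>One</para><para>Two</para></doc>")
result = transform(source)
print(result)
```

Because each rule depends only on its inputs, sibling subtrees could in principle be rendered in any order, and the rule table can be inspected without running it. That is the kind of property that gets lost when the transformation is arbitrary JS.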

> Those were defined as machine language meant for machine talking to machine

I don't believe this is true. Machine language doesn't need the kind of verbosity that XML provides. SGML/HTML/XML were designed to allow humans to produce machine-readable data, so they were meant for humans to talk to machines and vice versa.

  • Yes, I think the main difference is having imperative vs declarative computation. With declarative computation, the performance of your code is dependent on the performance and expressiveness of the declarative layer, such as XML/XSLT. XSLT lacks the expressiveness to get around its own performance limitations.

It was very odd that a simple markup language was somehow seen as the savior for all computing problems.

Markup languages are a fine and useful and powerful way for modeling documents, as in narrative documents with structure meant for human consumption.

XML never had much to recommend it as the general purpose format for modeling all structured data, including data meant primarily for machines to produce and consume.
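
A quick sketch of that mismatch, using only Python's standard library and a made-up user record: the same data parsed from XML comes back as untyped text with no marker for "this is a list", while JSON at least keeps numbers and arrays distinct.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in both formats.
xml_doc = ET.fromstring('<user id="42"><name>Ada</name><tag>admin</tag></user>')
json_doc = json.loads('{"id": 42, "name": "Ada", "tags": ["admin"]}')

# XML: attribute and element content are both plain strings.
xml_id = xml_doc.get("id")
xml_name = xml_doc.findtext("name")

# XML: a single <tag> child. Without a schema, nothing says whether this is
# a one-element list or a scalar field.
xml_tags = xml_doc.findall("tag")

# JSON: the number stays a number and the array is explicit.
json_id = json_doc["id"]
json_tags = json_doc["tags"]

print(type(xml_id).__name__, type(json_id).__name__, len(xml_tags), json_tags)
```

For a narrative document none of this matters; for records passed between programs, every consumer has to re-derive the types and the list-vs-scalar decision from a schema or from convention.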