Comment by krab

11 years ago

The point about namespaces leads me to think that the main problem with XML is that on the surface it looks very simple, but in fact it's not. It is tempting to take shortcuts when processing XML: for example parsing with regexes, looking at the namespace prefix instead of the URI it is bound to, or producing XML without proper escaping. There are also some gotchas, like certain characters not being representable in XML at all.
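To make two of those shortcuts concrete, here is a minimal sketch using Python's standard library. The documents, the `http://example.com/ns` URI and the helper names `naive_element` / `escaped_element` are made up for illustration; they are not from any real project.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Two documents that use different prefixes bound to the same namespace URI.
# (The URI and element names are hypothetical.)
doc_a = '<a:item xmlns:a="http://example.com/ns">x</a:item>'
doc_b = '<b:item xmlns:b="http://example.com/ns">x</b:item>'

# A namespace-aware parser resolves the prefix to its definition, so both
# elements get the same expanded name; code that matched on "a:item"
# would miss the second document.
tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag


def naive_element(text):
    # Shortcut: plain string concatenation, no escaping.
    return "<note>" + text + "</note>"


def escaped_element(text):
    # escape() replaces &, < and > with their entity references.
    return "<note>" + escape(text) + "</note>"


raw = "1 < 2 & 3 > 2"

# The escaped version round-trips through a parser.
round_tripped = ET.fromstring(escaped_element(raw)).text

# The naive version is not well-formed XML and is rejected.
try:
    ET.fromstring(naive_element(raw))
    naive_parses = True
except ET.ParseError:
    naive_parses = False
```

The prefix is only a local alias; the identity of the element is the pair of namespace URI and local name, which is why ElementTree reports it in the `{uri}name` form.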

I personally like XML and XSLT (2.0), but to work with them efficiently you need to spend some time learning, which is not obvious at first sight.

What about the alternatives?

JSON has a big advantage, which is its unambiguous automatic mapping to objects. This benefit is not that apparent in languages like Java, where you'd still declare a class to represent either the XML or the JSON document. Moreover, there are projects which essentially try to bring schemas and namespaces to JSON. JSON-LD is an example of namespaces being added without explicit support in the underlying format. There is even a command-line tool, jq, a big part of which is a query engine similar to XPath.
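As a rough illustration of that automapping in a dynamic language, here is a small Python sketch. The sample document is invented; the point is only that every JSON value has exactly one obvious native counterpart, with no attribute-versus-child-element decision to make.

```python
import json

# A made-up JSON document, used only for illustration.
text = '{"name": "example", "tags": ["xml", "json"], "active": true, "count": 11}'

# Objects become dicts, arrays become lists, and so on, with no
# mapping rules to write by hand.
obj = json.loads(text)

# Serializing and parsing again gives back an equal structure.
round_tripped = json.loads(json.dumps(obj))
```

In a statically typed language you would normally still write a class for `obj`, which is why the advantage shrinks there.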

S-expressions, if used widely, would probably go down the same path as JSON: recreating a lot of what is considered bloat in XML.

Another alternative mentioned was a custom text format. I assume the author meant designing a format from scratch. I wrote that to use XML efficiently you need to put in some work. But making a backwards (and forwards?) compatible text format that correctly handles malformed and malicious input requires much more effort.