Comment by AceJohnny2
11 years ago
I don't know if you're being ironic about JSON. Note that jkarneges, who comments elsewhere in this thread, is the creator of Psi [1], arguably the best XMPP-focused messaging client.
The value/burden of XML has always been a topic of debate for XMPP. In retrospect, I think it contributed to its lack of appeal, though the extensibility and readability (ehm, arguably) it provided were unique back then.
I've long wondered about which alternative base protocols could be used in its place. JSON is OK, but may be as much a fad as XML. I've wondered if ASN.1 could be used, but ProtoBufs sound like they're a better fit [2] in that they're simpler, more space-efficient, and backwards-and-forwards compatible (and thus extensible, XMPP's main feature). In fact, it's what Google already uses itself.
[1] http://psi-im.org/ [2] https://groups.google.com/forum/#!topic/protobuf/eNAZlnPKVW4
What is the purpose of layering your chat protocol over another protocol at all?
SMTP has no "base protocol" in this sense. HTTP, nothing (unless you count RFC 822).
It's hard to think these protocols would have had the same lifetime if they were based on XML, JSON, or protobufs. (Yeah, HTTP over XML, that should be enough to give you nightmares. But welcome to DAV and XMPP.)
If you're looking for a happy medium between the readability of JSON and XML and the efficiency of ASN.1 and protobufs, take a look at canonical S-expressions[1].
There's an advanced representation, which looks like this: (message (header (sender "Billy Joe Bob") (sent "2015-03-26T12:02:00Z")) (body "Hey guys! Let's meet up for lunch!")). It's possible to encode any byte string using Base64 or hex. It's also possible to encode types with data: (message (header (sender "Billy Joe Bob") (sent "2015-03-26T12:02:00Z")) (body [text/html]"<p>Hey guys! Let's meet up for lunch!</p>"))
While there are multiple advanced encodings for the same data (e.g. foo or "foo" or |Zm9v| or #666f6f#), there is a _single_ canonical encoding for any datum: the messages above would be (7:message(6:header(6:sender13:Billy Joe Bob)(4:sent20:2015-03-26T12:02:00Z))(4:body34:Hey guys! Let's meet up for lunch!)) and (7:message(6:header(6:sender13:Billy Joe Bob)(4:sent20:2015-03-26T12:02:00Z))(4:body[9:text/html]41:<p>Hey guys! Let's meet up for lunch!</p>)).
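A minimal sketch of the canonical form in Python, assuming the datum is modelled as nested lists of strings (that modelling and the function name are my own choices, not part of Rivest's draft). Display hints such as [text/html] are left out to keep it short.

```python
def canonical(node):
    """Encode nested lists of str/bytes as a canonical S-expression."""
    if isinstance(node, (list, tuple)):
        # A list is its encoded children wrapped in parentheses, no whitespace.
        return b"(" + b"".join(canonical(child) for child in node) + b")"
    if isinstance(node, str):
        node = node.encode("utf-8")
    # An atom is <decimal byte length>:<raw bytes>.
    return str(len(node)).encode("ascii") + b":" + node


message = [
    "message",
    ["header", ["sender", "Billy Joe Bob"], ["sent", "2015-03-26T12:02:00Z"]],
    ["body", "Hey guys! Let's meet up for lunch!"],
]

if __name__ == "__main__":
    print(canonical(message).decode("ascii"))
```

Because the length prefix counts bytes, not characters, the same function handles arbitrary binary atoms without any escaping.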
A huge advantage of this canonical encoding is that it's amenable to cryptographic hashing and signing; a weakness of JSON is that one has to layer requirements atop JSON itself (e.g. alphabetising object properties) in order for two parties to be able to hash the same datum and get the same value.
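To illustrate the JSON side of that point, here is a small sketch using only Python's standard library. The two objects are the same datum with keys written in a different order; the canonical_json helper is a hypothetical, ad hoc convention (sorted keys, no extra whitespace) that both parties would have to agree on out of band.

```python
import hashlib
import json


def digest(text):
    """SHA-256 hex digest of a string, hashed as UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same datum, serialised with keys in a different order.
a = json.dumps({"sender": "Billy Joe Bob", "body": "Lunch?"})
b = json.dumps({"body": "Lunch?", "sender": "Billy Joe Bob"})

same_datum = json.loads(a) == json.loads(b)  # the parsed values are equal
raw_match = digest(a) == digest(b)           # but the serialised bytes differ


def canonical_json(obj):
    """Ad hoc canonical form: sorted keys and no insignificant whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


canonical_match = digest(canonical_json(json.loads(a))) == digest(canonical_json(json.loads(b)))
```

The hashes only agree once both sides apply the extra rule, which is exactly the layering the canonical S-expression encoding avoids.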
Another advantage of canonical S-expressions is that it's straightforward to define a mapping between them and HTML: "<p class='foo'>This is a <em>nifty</em> paragraph.<br /></p>" could be represented as ((p (class foo)) "This is a " (em nifty) " paragraph." (br)). There are other possible mappings between S-expressions and HTML, of course, but I like that one. Another might be (p (/ (class foo)) "This is a " (em nifty) " paragraph." (br)).
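A rough sketch of the first mapping as a renderer, assuming the S-expression has already been parsed into nested Python lists (the render function and that list representation are illustrative, not from any spec). The text after the em element is written with its leading space so the rendered HTML keeps its word spacing, and attributes come out double-quoted.

```python
from html import escape


def render(node):
    """Render ((tag (attr value) ...) child ...) or (tag child ...) as HTML."""
    if isinstance(node, str):
        return escape(node)  # bare strings are text nodes
    head, *children = node
    if isinstance(head, str):
        tag, attrs = head, []  # (em nifty): no attributes
    else:
        tag, *attrs = head     # ((p (class foo)) ...): attributes follow the tag
    attr_text = "".join(f' {name}="{escape(value)}"' for name, value in attrs)
    if not children:
        # Simplification: every childless element is written self-closing.
        return f"<{tag}{attr_text} />"
    inner = "".join(render(child) for child in children)
    return f"<{tag}{attr_text}>{inner}</{tag}>"


doc = [["p", ["class", "foo"]], "This is a ", ["em", "nifty"], " paragraph.", ["br"]]

if __name__ == "__main__":
    print(render(doc))
```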
[1] http://people.csail.mit.edu/rivest/Sexp.txt
> there is a _single_ canonical encoding for any datum: the messages above would be (7:message(6:header(6:sender13:Billy Joe Bob)(4:sent20:2015-03-26T12:02:00Z))(4:body34:Hey guys! Let's meet up for lunch!))
This reminds me a lot of bencode, with the advantage for bencode that it doesn't need any fiddling for non-printable characters: no more base64, no more hex.
The base64 & hex stuff is only used for the advanced, human-readable bits; on the wire it's just straight length-encoding and byte strings.
I'd say that bencode's advantage is a built-in standard for integer encoding (with canonical S-expressions one must decide between ASCII decimals or little/big-endian bit strings), and a clearer standard for a dictionary/map/hash (a canonical S-expression would probably use an alist-like structure like (map (foo bar) (baz quux)), but one could also go with (map foo bar baz quux), (map (foo bar baz quux)) or some other encoding).
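For comparison, a minimal bencode encoder sketch in Python. The wire rules (i<decimal>e for integers, <length>:<bytes> for strings, l...e for lists, d...e for dictionaries with byte-string keys in sorted order) follow the BitTorrent specification; the function itself and its choice to encode Python str as UTF-8 are my own assumptions.

```python
def bencode(value):
    """Encode ints, bytes/str, lists and dicts using bencode."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        items = []
        for key, item in value.items():
            if isinstance(key, str):
                key = key.encode("utf-8")
            if not isinstance(key, bytes):
                raise TypeError("bencode dictionary keys must be byte strings")
            items.append((key, item))
        # Keys are emitted in sorted byte order, so there is one encoding per dict.
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


if __name__ == "__main__":
    print(bencode({"foo": "bar", "baz": "quux"}))
```

The sorted-key rule is what settles the dictionary question that canonical S-expressions leave to convention.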
XML is horrendous, especially to parse/scrape. JSON, on the other hand, is a breeze.
Only if you don't understand XML.
* XML has a formal, class-based description language (XML Schema) with strong typing, polymorphism, and - best of all - self-descriptiveness.
* Languages like Java have a seamless, bidirectional mapping to XML Schema.
* XML has a ridiculously powerful and elegant transformation language (XSLT) which makes scraping, selective data extraction and processing trivial.
The problem with XML is that people who require instant satisfaction are not willing to invest the time to understand it, and the mature tooling ecosystem around it.
The XML ecosystem solves problems, and contains solutions to problems, that the JSON / JavaScript ecosystem can only dream of, and is hell-bent on partially re-inventing.
If you need strong-typing and self-descriptiveness, you're out of luck with JSON. Binding JSON to a strong-typed language like Java or Haskell is a total ball-drag compared to XML + Schema.
I don't see why I can't use XML, JSON, MsgPack or YAML.
Couldn't the parsing be a pluggable component? Just set a standard on how data is structured and let third-parties figure out how data is parsed.
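Something like this is what I have in mind, as a very rough sketch: fix the data structure (here, a flat dict of string fields, which is far simpler than real XMPP stanzas) and register interchangeable codecs for it. The CODECS registry and all function names are hypothetical; only the json and xml.etree.ElementTree calls are real standard-library APIs.

```python
import json
import xml.etree.ElementTree as ET


def encode_json(msg):
    return json.dumps(msg, sort_keys=True).encode("utf-8")


def decode_json(data):
    return json.loads(data.decode("utf-8"))


def encode_xml(msg):
    # One child element per field, under a <message> root.
    root = ET.Element("message")
    for name, text in msg.items():
        ET.SubElement(root, name).text = text
    return ET.tostring(root, encoding="utf-8")


def decode_xml(data):
    return {child.tag: child.text or "" for child in ET.fromstring(data)}


# Hypothetical registry: format name -> (encoder, decoder).
CODECS = {
    "json": (encode_json, decode_json),
    "xml": (encode_xml, decode_xml),
}


def round_trip(fmt, msg):
    """Encode and decode msg with the named codec."""
    encode, decode = CODECS[fmt]
    return decode(encode(msg))


msg = {"sender": "Billy Joe Bob", "body": "Lunch?"}
```

Both ends would still have to agree on which codec is in use, so this moves the negotiation problem rather than removing it.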
And that would improve the XMPP adoption and experience by ... ?
Are you saying mom and dad aren't using XMPP because the message is sent using XML based stanzas? Facebook is ditching XMPP because of the X?
It doesn't matter?
As much as I can agree that XML doesn't prevent user adoption, it may very well prevent developer adoption; if developers are not willing to bother with XML or any part of the protocol, they may very well give up and develop for some other platform.
Yes, that may sound futile, but in the end the developers are the people who build everything. It is my strong belief that HTTP, IRC, SMTP and bittorrent (among others) have thrived because of their utter simplicity, to the point we're embarrassed today because we've ended up with so much under-specified crap on top of them. Still, they deliver. As always, "worse is better".
>And that would improve the XMPP adoption and experience by ... ?
Saving battery on mobile devices for one.