
Comment by guitarsteve

3 years ago

I’ve often heard this (YAML is a superset of JSON) but never looked into the details.

According to https://yaml.org/spec/1.2.2/, YAML 1.2 (from 2009) is a strict superset of JSON. Earlier versions were an _almost_ superset. Hence the confusion in this thread. It depends on the version…

CPAN link provided by the parent says 1.2 still isn't a superset:

> Addendum/2009: the YAML 1.2 spec is still incompatible with JSON, even though the incompatibilities have been documented (and are known to Brian) for many years and the spec makes explicit claims that YAML is a superset of JSON. It would be so easy to fix, but apparently, bullying people and corrupting userdata is so much easier.

  • Are these documented YAML 1.2 JSON incompatibilities listed / linked to somewhere?

    I assume these are something related to non-ascii string encoding / escapes?

    • They are listed in that same CPAN link

      "Please note that YAML has hardcoded limits on (simple) object key lengths that JSON doesn't have and also has different and incompatible unicode character escape syntax... YAML also does not allow \/ sequences in strings"
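A minimal sketch of the escape forms mentioned in the quote, using only Python's standard `json` module. It assumes the concern is a JSON document containing `\/` or `\u` surrogate-pair escapes, and shows one possible workaround: re-serializing so that neither escape form appears in the output. It says nothing about how any particular YAML parser behaves; the sample document is made up for illustration.

```python
import json

# Hypothetical JSON text using the two escape forms discussed above:
# an escaped solidus (\/) and a UTF-16 surrogate pair (\ud83d\ude00).
raw = r'{"path": "a\/b", "emoji": "\ud83d\ude00"}'

data = json.loads(raw)

# Re-serialize: json.dumps does not emit \/ and, with ensure_ascii=False,
# writes non-ASCII characters literally instead of as \u escapes.
normalized = json.dumps(data, ensure_ascii=False)

print(normalized)
```

Control characters would still be written as `\u00XX` escapes, so this only sidesteps the solidus and non-ASCII cases.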

The JSON::XS documentation linked above reports that YAML 1.2 is not a strict superset of JSON:

> Addendum/2009: the YAML 1.2 spec is still incompatible with JSON

The author also details their issues in, ah, getting some of the authors of the YAML specification to agree.

I just checked YAML 1.2, and the 1024-character limit on key length is still in the spec (https://yaml.org/spec/1.2.2/, ctrl+f, 1024). So any JSON with keys longer than that is not compatible with YAML.

  • To be fair, any JSON implementation is going to have a practical limit on the key size, it's just a bit more random and harder to figure out :)

    • If you mean limited by available memory, then sure but that does not apply just to key size. If you mean something else, could you elaborate?


  • Have a closer look. The 1024 limit in version 1.2 is only for implicit block mapping keys, not for flow style `{"foo": "bar"}`
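    For anyone who wants to check their own data, here is a rough sketch that walks a parsed JSON document and reports object keys longer than the 1024 figure quoted in this thread. The helper name `long_keys` and the path notation are made up for illustration. It counts code points in the decoded key, which is an assumption about how the limit is measured, and whether the limit applies at all depends on the YAML style, as noted above.

    ```python
    import json

    YAML_IMPLICIT_KEY_LIMIT = 1024  # figure quoted in the thread

    def long_keys(node, limit=YAML_IMPLICIT_KEY_LIMIT, path="$"):
        """Yield (parent_path, key_length) for each object key longer than `limit`."""
        if isinstance(node, dict):
            for key, value in node.items():
                if len(key) > limit:
                    yield path, len(key)
                # Truncate the key in the path so long keys stay readable.
                yield from long_keys(value, limit, f"{path}.{key[:16]}")
        elif isinstance(node, list):
            for i, item in enumerate(node):
                yield from long_keys(item, limit, f"{path}[{i}]")

    # Hypothetical document with two oversized keys.
    doc = json.loads(json.dumps({"ok": 1, "k" * 2000: {"nested": [{"x" * 1025: 0}]}}))
    hits = list(long_keys(doc))
    print(hits)
    ```

    An empty result only means no key exceeds the chosen limit; it does not prove the document round-trips through a YAML parser.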

In the beginning was the SGML.

Then we said it's too verbose. We named some subsets XML, HTML, XLSX.

Then we said it's still too long. So we named some subsets Markdown, and YML.

Then we said it's still too long, and made JSON.

What's wrong with subsets? Ambiguity in naming things.

https://news.ycombinator.com/item?id=26671136

  • > Then we said it's too verbose. We named some subsets XML, HTML, XLSX

    If anything, XML as an SGML subset is more verbose than SGML proper; in fact, getting rid of markup declarations to yield canonical markup without omitted/inferred tags, shortforms, etc. was the entire point of XML. Of course, XML suffered as an authoring format due to verbosity, which led to the Cambrian explosion of Wiki languages (MediaWiki, Markdown, etc.).

    Also, HTML was conceived as an SGML vocabulary/application [1], and for the most part still is [2] (save for mechanisms to smuggle CSS and JavaScript into HTML without the installed base of browsers displaying these as content at the time, plus HTML5's ad-hoc error recovery).

    [1]: http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html

    [2]: http://sgmljs.net/docs/html5.html

  • Well, Markdown and YML and JSON are not subsets of SGML, nobody claims they are, and nobody intended them as such. So there's that.

  • First they came for the angle brackets. And I did not speak out. Because I did not use XML...

    • You didn't use XML? But We use XML to read the comments here on this HTML web page.

      But I came for the angle brackets. Because I < We, eternally.

  • > Then we said it's still too long. So we named some subsets Markdown, and YML.

    > Then we said it's still too long, and made JSON.

    JSON is older than markdown and yaml.

  • I think you'll find that in the beginning were M-expressions, but they were evil, and were followed by S-expressions, which were and are and ever will be good.

    SGML and its descendants are okay for document markup.

    XML for data (as opposed to markup) is either evil or clown-shoes-for-a-hat insane — I can’t figure out which.

    JSON is simultaneously under- and over-specified, leading to systems where everything works right up until it doesn't. It shares a lot with C and Unix in this respect.

    • If XML for data is bad, check out XML as a programming language. This has cropped up a few times; the one that stuck with me was the templating structures in the FutureTense app server, before it was acquired by OpenMarket and they switched to JSPs or something.

      Lots of <for something> <other stuff> </for> sorts of evil.