Comment by conartist6

8 hours ago

A lot of people dislike that decision not to include comments in JSON, but I think while shocking it was and is totally correct.

In a programming language it's usually free to have comments because the comment is erased before the program runs; we usually render comments in grey text because they can't change the meaning of the program.

In a data language you have no such luxury. In a data language there's no comment erasure happening between the producer and the consumer, so comments are just dangerous as they would without doubt evolve into a system of annotations -- an additional layer of communication which would then not be standardized at all and which then would grow into a wild west of nonstandard features and compatibility workarounds.

I don't dislike the decision at all, FWIW! For data interchange it's totally reasonable. But it does make JSON ill-suited for a bunch of applications that JSON has been forcefully and unfortunately applied to.

> so comments are just dangerous as they would without doubt evolve into a system of annotations -- an additional layer of communication which would then not be standardized at all and which then would grow into a wild west of nonstandard features and compatibility workarounds

IIRC Douglas Crockford explicitly stated that he saw people initially using comments for a purpose like ad hoc preprocessor directives.

Could you imagine hitting a rest api and like 25% of the bytes are comments? lol

  • Worse than that - people will start tagging "this value is a Date" via comments, and you'll need to parse ad-hoc tags in the comments to decode the data. People already do tagging in-band, but at least it's in-band and you don't have to write a custom parser.

  • HTML and JS both have comments; I don't see the problem.

    • And both are poor interchange formats. When things stay in their lane, there is no "problem." When you try to make an interchange format using a language with too many features, or comments that people abuse to add parsable information (e.g. "type information") then there is a BIG problem.

  • > Could you imagine hitting a rest api and like 25% of the bytes are comments? lol

    That's pretty much what already happens. Getting a numeric value like "120" by serializing it through JSON takes three bytes. Getting the same value through a less flagrantly wasteful format would take one.

    I guess that's more than 25%. In the abstract ASCII integers are about 50% waste. ASCII labels for the values you're transferring are 100% waste; those labels literally are comments.

    If you're worried about wasting bandwidth on comments, JSON shouldn't be a format you ever consider, for any purpose.

    lol
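The byte counts in that last reply are easy to check. A minimal sketch, assuming UTF-8 JSON on the wire and an unsigned single-byte field as the "less wasteful" alternative; the `temperature` label is a made-up example, not something from the thread:

```python
import json
import struct

value = 120

# JSON writes the integer as ASCII decimal digits, so "120" is three bytes.
as_json = json.dumps(value).encode("utf-8")

# A fixed-width unsigned byte holds any value from 0 to 255 in one byte.
as_binary = struct.pack("B", value)

print(len(as_json), len(as_binary))  # 3 1

# The label travels with every record too (hypothetical field name).
record = json.dumps({"temperature": value}).encode("utf-8")
print(len(record))  # 20
```

The binary form only works if both sides already agree on the field layout, which is exactly the information the ASCII label carries.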

> In a programming language it's usually free to have comments because the comment is erased before the program runs

That's inherent to the language specification, but it isn't inherent to the document. You have to have a system with rules that require that erasure.

Nothing prevents one from mandating a system that strips those comments out of JSON. You could even "compile" JSON to, I don't know, BSON or msgpack or something.

Just as nothing prevents one from creating tooling to, say, extract type annotations from comments in a dynamically typed language.
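A stripping step like the one described could look roughly like this. It is a sketch, not any standard tool: `strip_json_comments` is a hypothetical helper, and it assumes `//` and `/* */` as the comment syntax, which JSON itself does not define.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            # Skip to the closing */, or to the end if it is missing.
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


commented = """
{
  // customer identifier, kept as a string
  "customerUUID": "3",
  "url": "http://example.com/a//b" /* slashes inside strings survive */
}
"""

data = json.loads(strip_json_comments(commented))
print(data)
```

Once the comments are gone the consumer sees plain JSON, so nothing in them can affect the parsed data. That is the erasure guarantee a programming language gets from its compiler.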

> In a data language there's no comment erasure happening between the producer and the consumer, so comments are just dangerous as they would without doubt evolve into a system of annotations -- an additional layer of communication which would then not be standardized at all and which then would grow into a wild west of nonstandard features and compatibility workarounds.

But there's nothing stopping you from commenting your JSON now. There's no obligation to use every field. There can't be, because the transfer format is independent of the use to which the transferred data is put after transfer.

And an unused field is a comment.

    {
      "customerUUID": "3",
      "comment": "it has to be called a 'UUID' for historical reasons"
    }

If this would 'without doubt' evolve into a system of annotations, JSON would already have a system of annotations.
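To make the "unused field is a comment" point concrete, here is a small sketch. The consumer function `read_customer_id` is hypothetical, and it assumes the consumer reads only the member it needs:

```python
import json

payload = """
{
  "customerUUID": "3",
  "comment": "it has to be called a 'UUID' for historical reasons"
}
"""


def read_customer_id(raw: str) -> str:
    # Hypothetical consumer: it looks up only the field it needs,
    # so an extra member such as "comment" is simply ignored.
    return json.loads(raw)["customerUUID"]


print(read_customer_id(payload))  # 3
```

Unlike a stripped comment, the extra member still travels over the wire and still shows up in the parsed object; it works as a comment only because this consumer never looks at it.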

> that decision not to include comments in JSON, but I think while shocking it was and is totally correct.

YAML is fugly, but it emerged from JSON not supporting comments. Now we're stuck with two languages for infrastructure configuration: a beautiful one that has no comments and is therefore unusable, and another where I can never format a list correctly on the first try but comments are OK.

  • JSON is obviously perfectly usable, given how widely it's used. Even Douglas Crockford suggested just using a JSON interpreter that strips out comments, if you need them.

    And if you want something like JSON that allows comments, and you aren't working on the web, Lua tables are fine.

> while shocking it was and is totally correct

Agreed. Consider how comments have been abused in HTML, XML, and RSS.

Any solution or technology that can be abused will be abused if there are no constraints.

No, it was obviously and flagrantly incorrect, as evidenced by the success of interchange formats that do allow for comments, including many real world systems that pragmatically allow comments even when JSON says they shouldn't. This is Stockholm Syndrome.

But what can we expect from a spec that somehow deems comments bad but can't define what a number is?

  • In what way do you feel numbers are ill-defined in JSON? The syntactic definition is clear and seems to yield a unique and obvious interpretation of JSON numbers as mathematical rational numbers.

    A given programming language may not have a built-in representation for rational numbers in general. That isn't the fault of JSON.

    • I can't really tell what you're trying to say; JSON also has no representation for rational numbers in general. The only numeric format it allows is the standard floating point "2.01e+25" format. Try representing 1/3 that way.

      The usual complaint about numbers not being well-defined in JSON is that you have to provide all numbers as strings; 13682916732413492 is ill-advised JSON, but "13682916732413492" is fine. That isn't technically a problem in JSON; it's a problem in JavaScript, but JSON parsers that handle number literals the way JavaScript does turn out to be common.

      Your "defense", on the other hand, actually is a lack in JSON itself. There is no way to represent rational numbers numerically.

  • As long as they stay comments there's no harm. As soon as they become struct tags and stripping comments affects the document's meaning you lose the plot.
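On the large-integer complaint in the numbers subthread, a short sketch of what goes wrong. It assumes that passing `parse_int=float` to Python's `json.loads` is a fair stand-in for a parser that stores every number as an IEEE 754 double, as JavaScript does; the `odd` key and its value (2^53 + 1) are chosen for illustration and do not come from the thread.

```python
import json

text = '{"quoted_in_thread": 13682916732413492, "odd": 9007199254740993}'

# Python's json module parses integer literals as arbitrary-precision ints.
exact = json.loads(text)

# Assumption: parse_int=float imitates a parser that keeps every number
# as a 64-bit double.
as_double = json.loads(text, parse_int=float)

print(exact["odd"])           # 9007199254740993
print(int(as_double["odd"]))  # 9007199254740992

# The integer quoted in the thread is even and below 2**54, so it happens
# to survive the round trip; integers above 2**53 are not safe in general.
print(int(as_double["quoted_in_thread"]) == exact["quoted_in_thread"])  # True
```

Sending such values as strings sidesteps the problem because the consumer decides how to parse them, at the cost of moving the type information out of the JSON syntax.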