Comment by chriswarbo

6 years ago

> there's only one syntax for the language (syntax extension and mixed-language documents aside) and it too follows a linear order

I'm of the opinion that programming languages should have two "official" representations: the usual text-based representation, and a machine-readable concrete syntax tree. The latter should be an existing format, e.g. JSON, XML, s-expressions, etc.

It should be "official" in three senses: it's standardised in the same way as the text format; language implementations (compilers/interpreters) accept it alongside the text format; and they have a mode which converts the text format into the machine-readable format (and optionally the other way). This is important, since lots of code 'in the wild' only makes sense after feeding it through particular preprocessors, specifically-configured build tools, sed-heavy shell scripts, etc., such that the only tool that can even parse the code correctly is the compiler/interpreter (and even that might need a bunch of tool-specific flags, env vars, config files, etc.!). This makes tooling much harder than it needs to be, and any "unofficial" workarounds will need constant work to keep up with changes to the language.

I say concrete syntax trees because we want to impose as little meaning as possible on the tokens: that makes tooling more robust in the face of things like macros/custom syntax, new language features, incomplete or malformed code, etc.
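To make that concrete, here's a minimal sketch of the idea for a made-up s-expression language (the token classes, node shapes and function names are all my own assumptions, not any real language's format, and it ignores strings and comments). It keeps every token, including whitespace, so the JSON form converts back to the exact original text, and it doesn't choke on unbalanced parentheses:

```python
import json
import re

# Hypothetical token classes for a toy s-expression language.
# Every character matches exactly one class, so nothing is dropped.
TOKEN_RE = re.compile(
    r"(?P<ws>\s+)|(?P<open>\()|(?P<close>\))|(?P<atom>[^\s()]+)"
)


def tokenize(text):
    """Split text into (kind, text) pairs, keeping whitespace."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def parse(text):
    """Build a concrete syntax tree: nesting only, no meaning attached."""
    root = {"kind": "file", "children": []}
    stack = [root]
    for kind, tok in tokenize(text):
        leaf = {"kind": kind, "text": tok}
        if kind == "open":
            node = {"kind": "list", "children": [leaf]}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif kind == "close" and len(stack) > 1:
            stack[-1]["children"].append(leaf)
            stack.pop()
        else:
            # Atoms, whitespace and any stray ")" are kept as plain leaves,
            # so malformed input still produces a tree.
            stack[-1]["children"].append(leaf)
    return root


def unparse(node):
    """Convert the tree back to text by concatenating the leaves in order."""
    if "text" in node:
        return node["text"]
    return "".join(unparse(child) for child in node["children"])


def to_json(text):
    """The 'machine-readable' representation of the source text."""
    return json.dumps(parse(text))
```

A tool that only wants to, say, rename an atom can load the JSON, edit one leaf's "text" and call unparse, without knowing what `define` or any macro means.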

Many BASIC implementations had this feature. The SAVE command saves a binary (tokenized) version of the source code; "SAVE file, A" saves it in ASCII format instead.

  • The SAVE command in old BASICs used to tokenize the individual BASIC statements and built-in functions, e.g. PRINT, GOTO, CHR$(). It could also tokenize line numbers. But it certainly didn't do things like tokenize a FOR/NEXT loop or anything that went beyond a line break (e.g. GOSUB/RETURN).

    Just typing those words sends me back too many years to a TTY on a PDP-11/10, when you "saved" a program by:

    1. Typing LIST but not hitting CR

    2. Starting the tape punch and pressing HERE-IS a few times to get a leader

    3. Hitting CR

    4. Waiting for the listing to finish

    5. Hitting HERE-IS a few times to get a trailer

    6. Folding the paper tape neatly :)
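For anyone who never saw it, the keyword tokenizing described above can be sketched roughly like this. The token byte values are made up for illustration (each real BASIC had its own table), the input is assumed to be ASCII, and line numbers are left as plain text here. Note that each line is handled on its own, so FOR and NEXT end up as two unrelated bytes rather than a loop structure:

```python
# Hypothetical token bytes; real BASICs each used their own table.
TOKENS = {"PRINT": 0x80, "GOTO": 0x81, "CHR$": 0x82, "FOR": 0x83, "NEXT": 0x84}
WORDS = {code: word for word, code in TOKENS.items()}


def tokenize_line(line):
    """Replace keywords with one-byte tokens; leave string literals alone."""
    out = bytearray()
    i = 0
    in_string = False
    while i < len(line):
        ch = line[i]
        if ch == '"':
            in_string = not in_string
        if not in_string:
            for word, code in TOKENS.items():
                if line.startswith(word, i):
                    out.append(code)
                    i += len(word)
                    break
            else:
                # No keyword starts here: copy the character through.
                out.append(ord(ch))
                i += 1
        else:
            # Inside a string literal nothing is tokenized.
            out.append(ord(ch))
            i += 1
    return bytes(out)


def detokenize_line(data):
    """What LIST (or an ASCII save) does: expand tokens back into keywords."""
    return "".join(WORDS[b] if b in WORDS else chr(b) for b in data)


def tokenize_program(lines):
    """Each line is tokenized independently; no cross-line structure exists."""
    return [tokenize_line(line) for line in lines]
```

The tokenized form is smaller and quicker to interpret, but it is still a flat, per-line byte stream, not a syntax tree.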