Syntax highlighting is a waste of an information channel

14 hours ago (buttondown.com)

I like using syntax highlighting when it breaks. For example, if all code below a particular line is a single color, then I probably forgot an end-quote or something. But this has become less uniquely useful due to the broader integration of parser-driven linters (e.g. tree-sitter), which—besides being able to drive highlighting—can explicitly deliver inline hints or parsing errors.

All that said, I'm one who appreciates information density! How about coloring branching code paths/call stacks?

My keyboard has a concept of "layers," which allows each key to map differently depending on the layer. I've seen this used to make a numpad or to have a QWERTY and DVORAK layer. What if highlighting was the same? Instead of competing for priority over the color channels, developers could explicitly swap layers?

  • > All that said, I'm one who appreciates information density! How about coloring branching code paths/call stacks? > > My keyboard has a concept of "layers," which allows each key to map differently depending on the layer. I've seen this used to make a numpad or to have a QWERTY and DVORAK layer. What if highlighting was the same? Instead of competing for priority over the color channels, developers could explicitly swap layers?

    I was thinking about coloring logic /scope blocks as a way to help visualize scope and flow, even if it required static analysis and a simple script it could be useful when I need to debug

For the last five years I've been working on this problem!

To solve it we need to be able to describe the structured content of a document without rendering it, and that means we need an embedding language for code documents.

I hope this doesn't sound overly technical: I'm just borrowing ideas from web browsers. I think of my project as being the creation of a DOM for code documents. The DOM serves a similar function. A semantic HTML documents has meaning independent of its rendered presentation and so it can be rendered many ways.

CSTML is my novel embedding language for code. You could think of it like a safe way to hold or serialize an arbitrary parse tree. Like HTML a CSTML document has "inner text" which this case is the source text if the program the parser saw. E.g. a tiny document might be `<Boolean> 'true' </>`. The parser injects node tags into the source text, creating what is essentially the perfect data stream to feed a syntax highlighter. To do the highlighting you print the string content if the document and use the control tags to decide on color. This is actually already how we syntax highlight the output from our own CLI as it happens. We use our streaming parser technology to parse our log output into a CSTML tag stream (in real time) and then we just swap out open and close node tags for ANSI escape codes, print the strings, and send that stream to stdout.

Here's a more complicated document generated from a real parse: https://gist.github.com/conartist6/412920886d52cb3f4fdcb90e3...

Many (most?) of these are achievable with neovim and tree-sitter (a plugin that gives you access to the AST), and surely many other editors. I have plugins installed right now that do several of the things that are mocked up here. Many more are done with virtual text and not color, but I don’t see why you couldn’t use highlighting instead.

I agree with the broader point of the article that color is underused, but the state of the art has moved way past what the author’s tools are currently configured to provide.

  • To be fair, the article is over 5 years old.

    The author seemed to be unfamiliar with tree-sitter (first appeared in 2018) and incorrectly assumed Atom used TextMate.

    Since then it's gotten much more popular and adopted by other editors.

> Color carries a huge amount of information. Color draws our attention. Color distinguishes things. And we just use it to distinguish syntax.

I didn't actually find the uncoloured circle to be that much harder to spot.

Before looking at the concrete examples, I thought:

> Understanding the syntax similarly isn't hard scanning over the code. But if the words were coloured in a way that didn't correspond to syntax, I expect I would find it distracting, in the same way as those experiments with colour words written in a different colour from what the word indicates. Trying to make use of that second information channel isn't necessarily a good idea.

> I like the aesthetic of syntax-coloured code. But also, it's reassuring. I'm not using it to help understand the syntax, but to confirm that there's software in place that understands the syntax the same way I do.

After: the rainbow parentheses honestly don't help that much, not for a language like LISP anyway. The context highlighting seems to work much better. But then, as I go on through the examples... these are all really doing the same kind of thing that syntax highlighting does! They're about the structure of the code. And I'd have to shift my expectation, but once aligned, it's again something I'd perceive the same way — as something to confirm my understanding rather than eliciting that understanding.

Perhaps being able to switch between colouring modes would change that, but I don't know how quickly I could get used to that technique. And then, the kind of code where most of these things would help is smelly anyway. I already try to avoid this kind of nesting. Maybe import and argument highlighting (which could be used at the same time, along with highlighting for class attributes and closures in Python?), though...

Reminds me of my college note taking practice. I used to take notes with about 3-5 different color pens, and my friends used to be puzzled about it (it sure looks weird to swap between pens often).

I used to reply that the color pens made it easier to keep context such as what teacher said was important, what I found difficult, when in the note I had an "aha!" moment, side comment from me, Q&A asked by student during lecture, or how certain things written down now is related to the point made earlier/later in the lecture/notes.

Text (note) is the content but our (at least mine) attention are not really made for plain text. There's so much more you can play with visual information.

Jetbtains IDEs let you configure this - my favorite use is to highlight kotlin extension functions differently than normal functions.

This kind of highlighting as a secondary information channel for compiler feedback is great. Color, weight, italics, underlines - all help increase information density when reading code.

Reminded me of ColorForth. Ist is a Forth dialect where color also carries code information

Once seen, this is so obviously missing. Why aren't we doing this?

I've used the Kate editor for years, it has a short list of strings that are auto-hilited... and I use frequently. If only I could edit and to that list ... wherever it's located!

If only there were a way I could highlite -one- string, and then use a single key to move from that instance to the next!

  • We are doing it. IntelliJ has done it for years. To highlight a string and move from one instance to the next select it, Edit -> Find Usages -> Highlight usages in file then use Next Highlighted Usage.

    Other things it can show you via highlighting:

    1. Bugs. The online static analysis will highlight code likely to be in error.

    2. Dead code. It's rendered in grey.

    3. Code that won't execute in this debugger session. Same.

    4. Identifiers you chose to temporarily highlight.

    5. Mutable vs immutable variables. Also: mutable variables that are never actually mutated.

    6. The assertion that failed in the last unit test run.

    You can also create your own smart highlighters using semantic search (it's sort of a grep for ASTs).

    And a gazillion more. People still using plain programmer text editors are missing out on a lot of features.

  • In Vim: * to highlight a word, n/C-n to move between them. You need to have the hlsearch option set.

I thought this was going to go into an extended ascii, maybe 2 bytes per character, 2 bits for rgb, a highlight flag ( as opposed to coloring the words) and an unused bit in honour of ascii.

That way the colour can be defined at write time, languages don't need to implement them, they can be like whitespace and you can use color for whatever you want. Colour would be ignored by both compile and runtime of course... unless it wasn't

Rainbow parentheses blew my mind. Why isn't every editor already doing this?

  • Terrible idea for color-blind people. And I don't even mean severely color-blind, like when you can't distiguish red from green at all, but something more common and less severe, like deuteranomaly - where it's shades of these colors that are hard to distinguish.

    • The color blind would still see the parenthesis at least like they see them currently without the color coding though, given they are distinguishable from the background by lightness.

    • Those people are able to distinguish suggested colors - the example is using primary colors. And at worst, it would be the same as all parenthesis being the same color (black).