Comment by chrisseaton

6 years ago

Most languages are context sensitive.

Most language tools are context free.

How did we go so wrong?

Languages being context sensitive results from hacking in new language features. It's a constant struggle to keep D context free :-/

  • Strictly speaking, D is not context free nor is any statically typed programming language. That said most languages, including D, have supersets that are context free and can be used to build an AST which can then be refined further with context sensitive checks (ie. variables are used after declaration) and then even further with type checks.

    Many language tools don't need a full blown language parser and can get by with a context-free parser for the language's superset. Things like autocompletion, spell checking, documentation/highlighting can all be implemented using context-free parsers.

    • I suppose we could argue about the definition of context free, but the compiler lexes it without any reference to the parser, and the parser builds the AST without any reference to symbols or semantic meaning.

      > spell checking ... using context-free parsers

      Only up to a point. D's spell checker uses the symbols in scope as its "dictionary". This means it is very good at guessing what you meant instead of what is typed, as the dictionary is very targeted. I.e. it uses the context, and the spell checker is part of the semantic pass.

      2 replies →

    • You realize the poster you are responding to is the creator of D? And has spend his career writing compilers? I'm not sure he needs to have someone explain what a context free grammar is.

      3 replies →

> Most languages are context sensitive. Most language tools are context free. How did we go so wrong?

We never did. We always knew that any real-world useful grammar is multi-level and attribute (well, it's constraint-based, though depending on your definition of "attribute grammar", those are equivalent). That's why we so much like recursive-descent parsers: adding any multi-level constraints are so easy to them - you have a full Turing-complete language at your disposal, unlike simplistic 1-dimensional DSLs of most parser generators.

Could this be viewed as supply exceeding demand?

Context-free grammars are ripe for theoretical computer science work even if they're not practically relevant. On the flipside, I suppose the constraints of context-free grammars are seen as a price not worth paying when designing a language.

Can you give an example (or three) where that lets us down? That would be very helpful to me I suspect.

  • Examples of which bit?

    Languages that are context-sensitive? C, C++, Java, JavaScript.

    Examples of tools based on the starting expectation that languages are context-free? Yacc, Bison, Jay.

    Examples of the problems this causes? Well we're using the wrong tool for the job, right from the start. Instead of using an appropriate tool the first thing we do is bend our tools out of shape. We use side-effect actions to subvert the tool's model in an uncontrolled way. We don't get the full benefits of the tool's original model and can't rely on its guarantees.