Comment by vector_spaces

6 years ago

I saw this great interview recently with Anders Hejlsberg at MSFT on how modern compiler construction differs from what is taught in traditional University courses. Is this what you're alluding to?

After watching that interview, it's strange to read a comment like yours. While the architecture is completely different, frankly it doesn't seem like that big a leap to go from "senior engineer with X years working on something tangentially related to compilers" to being productive with the new architecture in a reasonable amount of time. What am I missing?

https://youtu.be/wSdV1M7n4gQ

I will say that I asked Shriram Krishnamurthi why books on interpreters and compilers tend to avoid the question of parsing like the plague, and found his response a little unsatisfactory.

All these celebrated texts like SICP, EOPL, and Lisp in Small Pieces avoid the question of parsing altogether by focusing on s-expression based languages.

His response was effectively that parsing is over-emphasized in other books, and that it's truly orthogonal to other PL activities. Which I can understand, but in practice when I need a parser in my day job I usually don't have much say in the structure of the input "language" (since I'm usually parsing something coming from another source that I have little control over). And if the instructor of a compilers course holds this point of view, what other class would give you an opportunity to learn parsing in a formal setting?

  • A full half of the compilers course I took in undergrad was about {LA,S,}LR; I think "parsing is over-emphasized" is alive and well... That said, it didn't cover non-table-based parsing much. (On the other hand, I'm unaware of a non-table-based parsing algorithm that can statically warn you about problems with the grammar (ambiguities etc).)

  • Hey, I know Shriram! We were students at Rice together. Not saying we knew each other well, more like passing acquaintances, but I recognize his name.

    Anyway, as I recall from compilers classes (it has been ~30 years), the Dragon book heavily emphasized parsing, but the professor said "I'll assign you a parse to do by hand for homework, and that's it".

    The overall attitude was a combo of 1) parsing theory is also covered in another theory of computation class (where the classic textbook at the time was "Introduction to Automata Theory, Languages and Computation" by Hopcroft and Ullman, but these days may also be Sipser's book); 2) there were so many other topics to cover in compilers that a week or so of parsing was all the time budget allowed; and 3) the research focus of the professor's work was in other areas, and compiler research in general was the same.

    So the lack of attention to parsing might be a (perhaps unfortunate) side effect, even though there is enough material for an entire class on parsing.

    I took the advanced course in compilers too, which was more like a topics class - a research paper every week or so. I don't remember anything about parsing there either.

  • > why books on interpreters and compilers tend to avoid the question of parsing like the plague

    They avoid it like an old, boring anecdote that everyone already read in their childhood in the Dragon book.

    > parsing is over-emphasized in other books, and it's truly orthogonal to other PL activities.

    Right, "parsing science" != "compiler science". Parsing science isn't even a subset of compiler science. It's just another discipline which touches (some may even say overlaps with) it. Consider parsing a natural language. (Which btw any human does, but compiler stuff is surely not for every human.)

It's pretty hard to pick up compiler skills because very little of it is written down. It takes a lot of time (a few years) working with code bases and papers to absorb it.

  • > very little of it is written down

    > and papers to absorb it.

    I smell a contradiction. But I'm glad that even well-known industry people see it like that. It's not a common flu you can pick up at a university library. It's arcane lore: you need to travel far away and dig through the ruins of the Library of Alexandria to find it.

    • I don't see the contradiction - very little of it is written down, and so you need to spend a lot of time scraping around for the disparate parts that are written down and trying to absorb as much as possible from them. There are very few (in some parts of the field, none) books that bring together all the information.

Yes, more validation - compilers as taught in universities need improvement.

So, a startup of 10 people should hire and train people for months? Raise a $2M seed round, have 12-18 months to show product-market fit, and, as a founder, teach all the engineers their jobs? How many engineers go into compilers after school? University education is all they know. The industry is built on qualified engineers. NVIDIA's CEO was ahead of the curve on graphics cards, Google's founders were ahead in search; why are compilers the ones behind the curve?

  • >> Why are compilers the ones behind the curve?

    I suspect it’s because there aren’t a lot of high quality libraries you can integrate into the backend of compiler tools that don’t run into license issues pretty fast. Imagine if GNU binutils were more permissively licensed and as modular as clang. Then developing novel, non-GPL’d compiler infrastructure could depend on BFD - the boring part, where the work won’t bring bona fide improvements to your new compiler. Another factor is that LLVM’s quality and ubiquity have reduced the monetary and technical upside to pursuing new opportunities in compiler development.

Huh, he's talking a lot about how doing everything incrementally is infeasible; isn't this how Rust's compiler/tooling infrastructure works?