Comment by adamddev1
8 hours ago
How did we get so much better at writing compilers? Was it a better understanding of how to make syntax trees with ADTs etc?
I think the significant improvements are:
- not writing compilers in assembly
- not requiring overlays
- knowing how previous compilers produced fast code (Web search doesn’t give me conclusive answers, but that Fortran compiler may have been the first to do loop unrolling and common subexpression elimination)
- having way more memory, CPU and disk available
- possibly: spending less time looking at optimizations. I expect IBM tried hard to make the output of their compiler match the performance of hand-written assembly
The best link I could find is https://en.wikipedia.org/wiki/Fortran#FORTRAN_IV:
“In particular, the FORTRAN H compiler played an important role in the development of certain kinds of optimization approaches, such as allocating a specific set of registers to hold the values of variables while in a loop. Overall, the compiler had three levels of possible optimization, as Fortran compiler developers had learned early on that the ability to turn off optimization was a necessity, since it drove up compilation times considerably for program runs that often were not going to work anyway. Even with the larger amount of main memory available to it, the FORTRAN H compiler was still organized via a number of overlays.”
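For readers who haven't met the two optimizations named in the list above, here is a minimal hand-written sketch of what they do. This is Python purely for illustration: the function names are made up, and a real compiler applies these rewrites to its intermediate representation rather than to source text.

```python
# Common subexpression elimination: compute a repeated expression once.
def naive(a, b, c):
    return (a * b) + c * (a * b)   # (a * b) is evaluated twice

def with_cse(a, b, c):
    t = a * b                      # shared subexpression, evaluated once
    return t + c * t

# Loop unrolling: do several iterations' worth of work per pass,
# so less time is spent on loop control (compare, increment, jump).
def sum_rolled(xs):
    total = 0
    for i in range(len(xs)):
        total += xs[i]
    return total

def sum_unrolled_by_4(xs):
    total = 0
    n = len(xs)
    i = 0
    while i + 4 <= n:              # main body: four elements per pass
        total += xs[i]
        total += xs[i + 1]
        total += xs[i + 2]
        total += xs[i + 3]
        i += 4
    while i < n:                   # remainder: the leftover 0 to 3 elements
        total += xs[i]
        i += 1
    return total
```

In both cases the rewritten version must return exactly what the original did; the compiler's job is to prove that before applying the change.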
Writing an optimizing, production-ready compiler doesn't seem to be an easy task, even today. You can fork an existing compiler or study its code, but maintaining your own single-handedly doesn't seem realistic.
>> - not writing compilers in assembly
Sure, but you still generate the machine code, right? You still have to master the instructions of the target CPUs and their specifics.
> Sure, but you still generate the machine code, right? You still have to master the instructions of the target CPUs and their specifics.
You do, but self-hosted compilers tend to have two huge benefits:
1) they tend to be easier to reason about, being written in a high-level language
2) they exercise the code, and usually even seldom-used parts of the code, to make problems more noticeable
The author is comparing a hypothetical 1990 compiler to a 1970-ish compiler. The late 1960s and early 1970s are essentially when all of the foundational parser theory was laid down. By the late 1970s, we're getting into autoparallelization and autovectorization research. Monotone dataflow analysis was developed in the 1970s as well. To be a little bit glib, what basically happened is that compiler theory was born in the 1970s; if you wanted to track down most of the techniques in the Dragon book, I suspect the vast majority of them originate in that timeframe.
There is a second shift that occurs around 2000-2005-ish, which is the transition of optimizing compilers from an instruction-based semantics to a more value-based semantics, in that modern optimizers make no real attempt or guarantee to preserve the structure of code. For example, an if statement may happily be converted into an expression lacking an if entirely.
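To make the if-statement example concrete, here is a rough sketch of that kind of rewrite, done by hand. The function names are hypothetical, and Python itself does not perform this transformation; the point is only that both versions compute the same value, which is all a value-based optimizer promises to preserve.

```python
def clamp_branchy(x, limit):
    # Source as written: control flow decides the result.
    if x > limit:
        return limit
    return x

def clamp_select(x, limit):
    # Same value, no `if`: the comparison becomes a 0/1 value and
    # arithmetic selects the result. An optimizer might emit something
    # similar as a select or conditional-move instruction.
    take_limit = int(x > limit)
    return take_limit * limit + (1 - take_limit) * x
```

The second form is only valid here because evaluating both candidate results has no side effects, which is the sort of condition an optimizer has to check before doing this.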
I think the reason writing a compiler is easy today is the theory I learned in compilers class: how to write context-free grammars, the concept of abstract syntax trees, and the pattern of writing a recursive descent parser on top of a lexer that only looks one symbol ahead and has a peek function. On top of that, we have experience with lots of languages and type systems to draw from when constructing a new one.
I was just doing some research, and apparently all of this stuff was invented around the late 60s, so in the 70s it was still new and by the 90s it was standard practice. The Dragon book came out in 1986 and spelled it all out in one place.
Today we have the benefit of knowing the right ideas to use from the start and confidence that if you follow the formula it will all work out.
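As a rough illustration of the "formula" described above, here is a minimal sketch of a recursive descent parser over a lexer with one token of lookahead and a peek function. The grammar (integer arithmetic with parentheses), the tuple-based syntax tree, and all names are assumptions chosen for the example, not anything from a particular compiler.

```python
def tokenize(text):
    """Yield (kind, value) tokens, ending with an EOF token."""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "0123456789":
            j = i
            while j < len(text) and text[j] in "0123456789":
                j += 1
            yield ("NUM", int(text[i:j]))
            i = j
        elif ch in "+-*/()":
            yield (ch, ch)
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    yield ("EOF", None)

class Lexer:
    """Holds exactly one token of lookahead."""
    def __init__(self, text):
        self._tokens = tokenize(text)
        self._lookahead = next(self._tokens)

    def peek(self):
        return self._lookahead          # inspect without consuming

    def advance(self):
        tok = self._lookahead           # consume and return current token
        if tok[0] != "EOF":
            self._lookahead = next(self._tokens)
        return tok

# One function per grammar rule; the call structure mirrors the grammar.
def parse_expr(lx):                     # expr := term (('+' | '-') term)*
    node = parse_term(lx)
    while lx.peek()[0] in ("+", "-"):
        op = lx.advance()[0]
        node = (op, node, parse_term(lx))
    return node

def parse_term(lx):                     # term := factor (('*' | '/') factor)*
    node = parse_factor(lx)
    while lx.peek()[0] in ("*", "/"):
        op = lx.advance()[0]
        node = (op, node, parse_factor(lx))
    return node

def parse_factor(lx):                   # factor := NUM | '(' expr ')'
    kind, value = lx.advance()
    if kind == "NUM":
        return value
    if kind == "(":
        node = parse_expr(lx)
        if lx.advance()[0] != ")":
            raise SyntaxError("expected ')'")
        return node
    raise SyntaxError(f"unexpected token {kind!r}")

def parse(text):
    lx = Lexer(text)
    tree = parse_expr(lx)
    if lx.peek()[0] != "EOF":
        raise SyntaxError("trailing input")
    return tree
```

With this sketch, `parse("1 + 2 * 3")` returns the tree `("+", 1, ("*", 2, 3))`: operator precedence falls out of which rule calls which, with no extra machinery.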