Ask YC: Good compiler books?

18 years ago

Background: CS/EE undergrad... love studying programming languages... eager to study & write a compiler...

I'm going to take a compilers class next semester, but I researched the required book on Amazon: http://www.amazon.com/Modern-Compiler-Implementation-Andrew-Appel/dp/052182060X

Doesn't look too promising...

Any suggestions for good compiler books or related books?

10 years ago, I loved Hanson and Fraser:

http://www.amazon.com/exec/obidos/ASIN/0805316701

This is a good tour of front-end and back-end stuff, a decent intro to code gen, and is built on a hand-written recursive-descent C parser. That it's specific to one C compiler made it better to read straight through, but I rarely refer back to it.

So far this year, I've been liking Muchnick:

http://www.amazon.com/exec/obidos/ASIN/1558603204

This is pretty heavy on the backend and codegen, which is where my head is at these days.

Compilers is a big topic. What part interests you most? Parsing? Optimization? Semantics and error detection? Code generation?

"Engineering a Compiler" by Cooper and Torczon

The Dragon Book is not the best book these days. It focuses too much on stuff you won't care about and not enough on the stuff you do care about (e.g. it spends a lot of time on parser generators and says almost nothing about analysis and optimization).

EDIT: A good followup text is "Advanced Compiler Design and Implementation" by Muchnick

  • I'll second the vote for Engineering a Compiler.

    I'm not even that interested in actually making one, but the way they work is fascinating, and this book was excellent, and I kept it after the class was over.

1. Programming Language Processors in Java: Compilers and Interpreters http://www.dcs.gla.ac.uk/~daw/books/PLPJ/

2. The definitive ANTLR reference http://www.pragprog.com/titles/tpantlr

1. gives you a good introduction to writing parsers from scratch (without lex/yacc-style parsing frameworks), and is probably a good warm-up before the book you mention.

2. gives you an introduction to state-of-the-art parsing with a framework (ANTLR), plus a bit about compilation. Note: ANTLR also has a nice IDE for rapidly developing and prototyping parsers, ANTLRWorks. See http://antlr.org for more info.
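To give a feel for what "parsers from scratch" means in practice, here is a minimal recursive-descent sketch in Python. It is not taken from either book: the toy grammar, the `tokenize` and `evaluate` helpers, and the `Parser` class are all made up for illustration, and it evaluates the expression directly instead of building a syntax tree.

```python
import re

# Hypothetical toy grammar (integer arithmetic only):
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character near position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    """One method per grammar rule; each consumes tokens and returns a value."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            # Integer division, since the toy language only has integers.
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):
        tok = self.advance()
        if tok is not None and tok.isdigit():
            return int(tok)
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text):
    """Parse and evaluate a whole expression, rejecting trailing input."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at token {parser.peek()!r}")
    return value
```

Calling `evaluate("(1 + 2) * 3")` walks the three grammar rules in turn. Operator precedence falls out of which rule calls which, which is the main idea a hand-written parser teaches and a generator hides.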

A very nice parsing framework for Python is dparser. It allows you to write grammars as docstrings to methods, which makes it very easy to try out things http://www.ibm.com/developerworks/linux/library/l-cpdpars.ht... http://dparser.sourceforge.net/

Definite Clause Grammars for Prolog are also worth a look (at least for reference).

Modern Compiler Implementation by Andrew Appel is a very good introductory book for people getting started on writing a compiler.

And if you want to be more pragmatic, here is a shorter one: http://www.cl.cam.ac.uk/teaching/2004/CompConstr/NEJ/report....

The Dragon Book is a MUST for compiler researchers and people who are serious about compilers, but it is hard to follow, and I do not recommend it as your first compiler book.

For Ruby, you can take a look at the xruby compiler I wrote: http://xruby.googlecode.com (Google Code's server is having some problems right now, so you may have to wait a while)

I would say that there are two important things to consider:

1. Understanding how a compiler works without getting bogged down in programming language details. This means you should try to look at compilers written in Standard ML or OCaml (my favorite), since they are much easier to follow. For example, a datatype can be expressed much more succinctly as type any_value = Int | String | Float

rather than across 4 classes (as in the Java case).
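For comparison, here is roughly what the "4 classes" encoding looks like, sketched in Python rather than Java. The class names and the `describe` function are hypothetical, and the sketch assumes each variant carries a payload (in OCaml that would be something like `Int of int | String of string | Float of float`), which the one-liner above leaves out.

```python
from dataclasses import dataclass

class AnyValue:
    """Base class; plays the role of the OCaml type name any_value."""

@dataclass(frozen=True)
class IntValue(AnyValue):
    value: int

@dataclass(frozen=True)
class StringValue(AnyValue):
    value: str

@dataclass(frozen=True)
class FloatValue(AnyValue):
    value: float

def describe(v):
    """Dispatch on the variant by hand; OCaml would use a `match` here."""
    if isinstance(v, IntValue):
        return f"int {v.value}"
    if isinstance(v, StringValue):
        return f"string {v.value!r}"
    if isinstance(v, FloatValue):
        return f"float {v.value}"
    raise TypeError(f"not an AnyValue: {v!r}")
```

One base class plus three subclasses, and nothing checks that `describe` covers every case. An ML-family compiler checks that coverage for you, which matters a lot once the type is a syntax tree with dozens of node kinds.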

2. Start small, and understand it in chunks. For this, a lot of web-based resources are ideal. For example, to understand regular expressions, it helps to be able to visualize them and play around with them interactively:

http://osteele.com/archives/2006/02/reanimator

I would also look at simple examples of interpreters, and build up from that, looking at examples of toy compilers:

http://min-caml.sourceforge.net/index-e.html

These will help, of course alongside a book like Appel's or the Dragon Book.

We used Appel's book in college, and it wasn't too bad (in the hands of a decent professor, of course). Appel glosses over some subjects, but you really have to if you want to implement the base compiler in a single semester and still maybe have time to implement some of the more interesting features in the second half of the book.

About half our class didn't finish the base compiler. Fortunately for us, we were using the Java version of the text, so most of the students already knew the implementation language; I know a course at another school using the ML text, and none of them finished their compiler, because they had to learn the language at the same time as coding the compiler.

Here's an online book that covers interpretation of programming languages in Scheme (touching on functional language compilation in several places):

http://www.cs.brown.edu/~sk/Publications/Books/ProgLangs/200...

another (more interactive) option might be to pick up a standard compilers textbook, e.g. aho-ullman / aho-ullman-sethi, and follow the intro-compilers coursework available at any of the universities (e.g. stanford). work through the exercises, i.e. hack, hack, hack!

at least at stanford, you will end up writing a small compiler for a toy java-like language. kind of nice. from there you can go on to more mature stuff, e.g. llvm.

edit: it would be even more fun if you could take the tests too (within the stipulated times, of course).

The nice thing about Appel's book is that you put together a compiler as you go along - it's easier to self-study. The Dragon Book's compiler project is an appendix.

This is one of the best books I have come across. Precise and to the point. Lots of examples, and bibliographic references.

http://www.wiley.co.uk/wileychi/grune/

There is also a great book by the same author on parsing.

http://www.cs.vu.nl/~dick/PTAPG.html

  • I second "Modern Compiler Design" by Dick Grune. Without being bound to any language in particular, it gives you all the juicy details about how lex, yacc, LL(1) parsers, recursive descent parsers etc. really work, various backend/code generation techniques (threading, BURS etc.), and also how compilation works for various programming paradigms (functional, logic, imperative). This is the most "dense" beginner's book I've found. And it is very engagingly written too. Once you've worked through this you are in good shape for more advanced books like Muchnick's.

    Good Luck

the dragon book.

  • Can you please elaborate? Link?

    I'm really interested in Python, Ruby, JavaScript... Not looking to write a massive, enterprise-scale compiler/interpreter/parser/whatever in C, C++, or Java... possibly Java... but you get the point...

    I've been checking out projects like Mozilla Rhino and PyPy, and keeping up to date with ECMAScript 4 progress... just want a solid book or two or more... :)

  • I second that recommendation, with a preemptive shush to anyone who was about to whine that it spends so much time covering parser generators. Those algorithms may be old hat, but they're nonetheless enlightening once you understand why they work.

    I do wish they covered optimization in more depth, though.

  • I second this. It was (and still is) a book almost all instructors swore by back when I was at uni, even though our edition had a green cover and no dragon on it :)