Comment by santaclaus
10 years ago
> the whole Dragon book
I learned compilers from Al Aho, and believe me, the class was no better in person. Mostly in-person waxing poetic about his time at Bell Labs...
Dinner with Al Aho was one of the highlights of when I visited Columbia a while ago. He's simply an awesome person. And he's done some fantastic work - Aho-Corasick is a great algorithm. Alas, the Dragon Book was one of the low points of my undergraduate education, though I had a great teacher who pulled out a terrifically fun class despite the book. (Thanks, +Wilson Hsieh.)
In fairness, though -- or more an admission of how wrong we can be about things like this -- I subscribed to the "parsing is boring and solved" belief until Bryan Ford's Packrat Parsing work made me realize that it had just gotten stuck at a traffic light for a few decades.
Veering a bit off-topic, but does anyone have any pointers to important recent work on parsing? There are a lot of papers out there and I guess I don't know how to sift through them. I've heard the "parsing is solved" line before, but so much of my time is spent doing some type of parsing that even incremental improvements are extremely interesting.
It really depends on the kind of parsing that you're doing.
Full context-free grammars are supported by "generalised parsing". The older stuff is GLR by Tomita, Earley parsing, and the CYK algorithm. The newer stuff based on Tomita is the particular rabbit hole I stuck with for a while. I read about SGLR, which eliminates lexers, and GLL, which is GLR but based on the LL algorithm. The people who do GLL research also made improvements to SGLR, with better speed on right-nulled grammars. Then there are the SGLR improvements with disambiguation filters, automatic derivation of error recovery with island grammars, etc. The disambiguation filters include a kind of negation, making the current implementation of SGLR for SDF capable of parsing Boolean grammars, which are a superset of context-free grammars.
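To make the "older stuff" concrete, here is a minimal CYK recognizer in Python. It handles any grammar in Chomsky normal form; the grammar and rule names below are a made-up toy example, not from any of the systems mentioned above.

```python
def cyk(word, binary_rules, terminal_rules, start):
    """Return True if `word` is derivable from `start`.

    binary_rules:   list of (lhs, (B, C)) for rules lhs -> B C
    terminal_rules: list of (lhs, ch)     for rules lhs -> ch
    """
    n = len(word)
    if n == 0:
        return False  # this sketch does not handle nullable grammars
    # table[span-1][i] holds the nonterminals deriving word[i:i+span]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for lhs, t in terminal_rules:
            if t == ch:
                table[0][i].add(lhs)
    for span in range(2, n + 1):          # substring length
        for i in range(n - span + 1):     # start position
            for split in range(1, span):  # where to cut the substring
                for lhs, (b, c) in binary_rules:
                    if (b in table[split - 1][i]
                            and c in table[span - split - 1][i + split]):
                        table[span - 1][i].add(lhs)
    return start in table[n - 1][0]

# Toy grammar: S -> A B, A -> 'a', B -> 'b'
binary_rules = [("S", ("A", "B"))]
terminal_rules = [("A", "a"), ("B", "b")]
print(cyk("ab", binary_rules, terminal_rules, "S"))  # True
print(cyk("ba", binary_rules, terminal_rules, "S"))  # False
```

The triple loop is where the O(n³) cost comes from; GLR and Earley get better behavior on the deterministic parts of a grammar, which is why they displaced plain CYK in practice.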
Anyway, there's more than context-free grammars. Definitely look into data-dependent grammars! They can define a lot of interesting things, like network protocol formats where you parse the length of the payload, then based on that length you know how much to interpret as the payload before you read the footer of the message. And you write all of that in a more declarative way and get a nice parser generated from it.
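The length-prefixed-payload pattern described above, written by hand, looks something like this sketch (the message layout is hypothetical; a data-dependent grammar tool would generate equivalent code from a declarative rule):

```python
import struct

def parse_message(data):
    """Parse: 2-byte big-endian length | payload | 1-byte footer."""
    (length,) = struct.unpack_from(">H", data, 0)  # read the length field first
    payload = data[2:2 + length]   # data-dependent: how much to consume
                                   # comes from a value parsed earlier
    footer = data[2 + length]      # only now do we know where the footer is
    return length, payload, footer

msg = struct.pack(">H", 5) + b"hello" + bytes([0xFF])
print(parse_message(msg))  # (5, b'hello', 255)
```

This is exactly the dependency a plain context-free grammar cannot express: the structure of the rest of the input depends on a value parsed earlier.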
There is so much more, but I think I should stop now :)
I can probably help if you can narrow things down a bit. Are you parsing deterministic languages (e.g. for programming languages), general context-free languages, binary file formats, or something else?
Because parsing attracts little active research interest and most of the work happened more than 20 years ago, there are a lot of techniques from the 70s, 80s, and 90s that remain relatively unknown.
> Veering a bit offtopic, but does anyone have any pointers to important recent work on parsing?
Parser combinators are a relatively modern topic in parsing. The first research on the topic dates to the late 1980s, but parser combinators became popular after Parsec [0], a practical implementation of parser combinators in Haskell. These ideas have since been borrowed by many implementations in different languages.
[0] Daan Leijen: "Parsec, a fast combinator parser", 2001 http://research.microsoft.com/en-us/um/people/daan/download/...
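For anyone who hasn't seen the style, here is a minimal parser-combinator sketch in Python, loosely in the spirit of Parsec. All names are illustrative, not Parsec's API: a parser is a function from (string, position) to (value, new position), or None on failure, and combinators build bigger parsers out of smaller ones.

```python
def char(c):
    """Parser that matches exactly the character c."""
    def p(s, i):
        if i < len(s) and s[i] == c:
            return (c, i + 1)
        return None
    return p

def seq(*parsers):
    """Run parsers one after another, collecting their results."""
    def p(s, i):
        values = []
        for q in parsers:
            r = q(s, i)
            if r is None:
                return None
            v, i = r
            values.append(v)
        return (values, i)
    return p

def alt(*parsers):
    """Try each parser at the same position; first success wins."""
    def p(s, i):
        for q in parsers:
            r = q(s, i)
            if r is not None:
                return r
        return None
    return p

def many(q):
    """Apply q zero or more times."""
    def p(s, i):
        values = []
        while True:
            r = q(s, i)
            if r is None:
                return (values, i)
            v, i = r
            values.append(v)
    return p

# Grammar: ("ab" | "cd")*
word = alt(seq(char("a"), char("b")), seq(char("c"), char("d")))
print(many(word)("abcdab", 0))  # ([['a', 'b'], ['c', 'd'], ['a', 'b']], 6)
```

The appeal is that the grammar and the parser are the same program, so ordinary abstraction (functions, modules) works on grammars too.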
Check out ANTLR, which is probably the best modern parser generator. ANTLR 4 uses "adaptive LL(*)" (aka "ALL(*)") as its underlying theory. http://www.antlr.org/papers/allstar-techreport.pdf
PEG (Packrat) is hot now.
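The packrat idea is just recursive-descent PEG parsing with memoization, which is what buys the linear-time guarantee. A tiny hand-rolled sketch (the grammar is a made-up example: `Expr <- Term ('+' Term)*`, `Term <- [0-9]+`):

```python
from functools import lru_cache

def parse(s):
    """Return True if s matches Expr <- Term ('+' Term)* completely."""
    @lru_cache(maxsize=None)   # memoization: each rule runs at most
    def term(i):               # once per input position
        j = i
        while j < len(s) and s[j].isdigit():
            j += 1
        return j if j > i else None   # position after the match, or failure

    @lru_cache(maxsize=None)
    def expr(i):
        j = term(i)
        if j is None:
            return None
        while j < len(s) and s[j] == "+":
            k = term(j + 1)
            if k is None:
                break  # PEG-style: stop repeating, keep what matched so far
            j = k
        return j

    return expr(0) == len(s)

print(parse("1+23+4"))  # True
print(parse("1++2"))    # False
```

In this toy grammar every position is visited once anyway; the memo tables start to pay off when alternatives cause the same rule to be retried at the same position.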