
Comment by ahelwer

5 years ago

Does anybody know of good introductory resources on error recovery techniques when writing a parser & interpreter? Bonus points if they use combinators instead of a hand-written parser.

Not a resource, but depending on what kind of parser you are doing, maybe an idea.

The thing I'm going to try next in one of my batch parsers (written in recursive-descent style, like many or most parsers in the wild) is to have AST nodes of kind "invalid". The idea is that the parser should always run to completion, with minimal in-line error checking, even where my parser code makes calls such as "require_token_kind()".

Maybe not even an "invalid" kind, but an additional "invalid" flag, so that even invalid nodes can still carry all the regular structure that depends on their node kind.
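To make the "invalid" flag idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Node` and `Parser` classes, the toy grammar (`name = number ;`), and the token format are made up, and only the name `require_token_kind` comes from the comment above. It is one possible shape, not a description of the actual parser.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    text: str = ""
    invalid: bool = False  # set when this node could not be parsed cleanly


class Parser:
    """Hypothetical recursive-descent parser over (kind, text) token pairs."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.errors = []  # collected here, inspected by the caller later

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", "")

    def require_token_kind(self, kind):
        tok_kind, text = self.peek()
        if tok_kind == kind:
            self.pos += 1
            return Node(kind, text=text)
        # Record the problem and hand back a placeholder; no exception,
        # no error code for the caller to check, and no token consumed.
        self.errors.append(f"token {self.pos}: expected {kind}, found {tok_kind}")
        return Node(kind, invalid=True)

    def parse_assignment(self):
        # assignment := name "=" number ";"
        errors_before = len(self.errors)
        name = self.require_token_kind("name")
        self.require_token_kind("=")
        value = self.require_token_kind("number")
        self.require_token_kind(";")
        node = Node("assignment", [name, value])
        node.invalid = len(self.errors) > errors_before
        return node

    def parse_program(self):
        program = Node("program")
        while self.peek()[0] != "eof":
            start = self.pos
            program.children.append(self.parse_assignment())
            if self.pos == start:
                self.pos += 1  # skip one token so the loop always progresses
        program.invalid = bool(self.errors)
        return program
```

Note that `parse_assignment` contains no branches for error handling at all: the node keeps its regular structure, and the flag plus the `errors` list are looked at only after parsing has finished.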

The other obvious idea is to use a parser generator. But I'm pretty sure I wouldn't be happy going that route: from the few experiments I've done, I infer it's too bureaucratic, with too much boilerplate for me to put up with.

The more general advice for error handling is that you should try to make APIs that require very little in-line error handling. Instead, keep errors behind the APIs, maybe collect them in a list, or only keep the first or last error, and allow them to be handled at a later point (out-of-line).

And boom, most of the talk about smart syntax and type systems and other clever systems to make error handling more convenient suddenly becomes obsolete.

Yet more abstractly, don't try to build systems that make cumbersome things more convenient, but think about how you can get by with fewer of these cumbersome things. (That works wonders for runtime performance, too.)

  • >The more general advice for error handling is that you should try to make APIs that require very little in-line error handling. Instead, keep errors behind the APIs, maybe collect them in a list, or only keep the first or last error, and allow them to be handled at a later point (out-of-line).

    I guess you mean that, instead of just printing what's wrong at the point where it happens, you should build handier error handling, such as attaching the error to the AST node and then maybe doing something with it later.

    • The important point is the temporal decoupling between the time where the error is detected and the time where it is handled.
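A small sketch of that temporal decoupling, in Python. The `RecordWriter` class, its methods, and the "no commas allowed" rule are all hypothetical, invented only to show the "keep the first error behind the API" variant mentioned above; it is not taken from any real library.

```python
class RecordWriter:
    """Hypothetical API that remembers the first error instead of raising."""

    def __init__(self, sink):
        self.sink = sink          # any list-like object with append()
        self.first_error = None   # the error is detected now, handled later

    @staticmethod
    def _encode(record):
        # Made-up validation rule, only here so that something can fail.
        if "," in record:
            raise ValueError(f"record contains a separator: {record!r}")
        return record

    def write(self, record):
        if self.first_error is not None:
            return  # already failed: later calls become no-ops
        try:
            self.sink.append(self._encode(record))
        except ValueError as exc:
            self.first_error = exc  # keep it behind the API

    def finish(self):
        """The single out-of-line place where the caller looks at the error."""
        return self.first_error


def write_all(records):
    out = []
    writer = RecordWriter(out)
    for record in records:
        writer.write(record)      # no error check at any call site
    return out, writer.finish()   # one check, after the fact
```

The loop in `write_all` has no in-line error handling; the caller decides once, at the end, what to do with whatever `finish()` returns. Collecting every error in a list instead of keeping only the first would be a small change to `write`.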

I've actually been meaning to write something up about this. I've found you can get fairly good error messages from a parser combinator style parser just by saving the deepest location you successfully parsed to and the parser you failed in.
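Roughly what that could look like, as a hedged Python sketch. The combinator names (`literal`, `seq`, `alt`, `digits`), the `Furthest` tracker, and the toy grammar are assumptions made up for this example, not the code the write-up would be about. Each parser takes `(text, pos, far)` and returns the new position, or `None` on failure; every failure reports its position and its own name to the tracker, which keeps only the deepest one.

```python
class Furthest:
    """Remembers the deepest failure position and which parsers failed there."""

    def __init__(self):
        self.pos = -1
        self.expected = []

    def record(self, pos, name):
        if pos > self.pos:
            self.pos, self.expected = pos, [name]
        elif pos == self.pos and name not in self.expected:
            self.expected.append(name)


def literal(s):
    def parse(text, pos, far):
        if text.startswith(s, pos):
            return pos + len(s)
        far.record(pos, repr(s))
        return None
    return parse


def digits(text, pos, far):
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == pos:
        far.record(pos, "digits")
        return None
    return end


def seq(*parsers):
    def parse(text, pos, far):
        for p in parsers:
            pos = p(text, pos, far)
            if pos is None:
                return None
        return pos
    return parse


def alt(*parsers):
    def parse(text, pos, far):
        for p in parsers:
            end = p(text, pos, far)
            if end is not None:
                return end
        return None
    return parse


def run(parser, text):
    far = Furthest()
    end = parser(text, 0, far)
    if end == len(text):
        return "ok"
    if end is not None:
        far.record(end, "end of input")  # parsed a prefix, input left over
    return f"parse error at offset {far.pos}: expected {' or '.join(far.expected)}"


# Toy grammar: item := "(" digits "," digits ")" | digits
pair = seq(literal("("), digits, literal(","), digits, literal(")"))
item = alt(pair, digits)
```

With this grammar, `run(item, "(12,)")` reports the failure at offset 4 inside `pair`, where `digits` was expected, rather than the shallow failure at offset 0 that `alt` hits when it backtracks to its second branch.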