Comment by scotty79

2 months ago

When I need to parse something nowadays I always end up with parser combinators. They just make so much sense.

What language do you use parser combinators in, and what kind of grammar do you parse usually? Nom was terribly verbose and unergonomic even by Rust's standards. Haskell's Megaparsec/Parsec is good but yeah, it's Haskell, you need to handle multiple monads (Parser itself is monadic, then your AST state, and maybe some error handling) at once and that's where I got confused. But I appreciated the elegance.

I experimented with PCs in Haskell and Rust (nom), then moved on to parser generators in Rust (pest.rs), Ocaml (Menhir), Haskell (Happy) and finally ended up with python's Lark - the speed of experimenting with different syntax/grammars is just insane.

  • Parser combinators is more of a concept than a library. You could make your own supporting the stuff you need. I like writing programs in languages I don't know or I barely know. I usually just take one of the popular libraries in any given language.

    For Rust I used Nom and I didn't mind it all that much although I noticed it's quite baroque. If I had more to write I'd probably make some wrappers or macros of my own for most commonly used Nom snippets.

  • I've used tree-sitter for generating my parsers in Rust, and just working with the untyped syntax tree it generates, and gives you error-tolerance for free. It's a bit of a setup at first tho, requiring an extra crate for the generated parser, but editing it from there saves so much time.

    • What do you mean exactly by "error-tolerance"? Is it like, each node is wrapped into a result type, that you have to match against each time you visit it, even though you know for a fact, that it is not empty or something like that?

      I suppose that one of the pros of using tree-sitter is its portability? For example, I could define my grammar to both parse my code and to do proper syntax highlighting in the browser with the same library and same grammar? Is that correct? Also it is used in neovim extensively to define syntax for a languages? Otherwise it would have taken to slightly modify the grammar.

      2 replies →