Comment by IshKebab
2 days ago
I don't think parser combinators are any more difficult in imperative languages like Python than functional ones? I found a few for Python and C++.
2 days ago
I don't think parser combinators are any more difficult in imperative languages like Python than functional ones? I found a few for Python and C++.
Don’t know about difficult, but at least less elegant. Lazy evaluation, type inference, abstractions like Functor/Applicative/Alternative/Monad make them so incredibly natural to work with in a language like Haskell. Sure, they exist in other languages (made a few myself) but it’s not the same.
Yes, I'm writing a parser in Unison (in the Haskell family) for ASN.1 at the moment. It's so clean to write parsers with parser combinators.
For example Asn1Type can be of form Builtin, Referenced, or Constrained. So a sum type.
Assuming you have the parsers for Builtin, Referenced, and Constrained, you're golden. (Haskell PCs look very similar, possibly even exactly the same minus different parenthesis for operator precedence reasons.
Compare Parsy for Python, particularly the verbosity (this parses SELECT statements in SQL):
).combine_dict(Select)
The same thing in a FP-style language would be something like
which would feed into something like
The general notion of a “combinator” comes from functional programming, so parser combinators are native to FP. They can definitely be done in imperative languages also, but in my experience (C# specifically) are much more clunky. A dynamically typed language like Python might be suitable, because the interpreter is lenient, but I’m dubious about C++.
One thing you will definitely need in any language to produce a concise parser is the ability to define custom operators, like >>. and .>>. These are the “combinators”.
All Turing-complete languages are technically equivalent in capability, but that doesn't mean the same task will cost the same effort in each language.
The closer a task is to the limit of your ability to handle complexity, the more you will benefit from tools that simplify the task.
The choice of programming language unfortunately what tools are possible to implement ergonomically (in terms of both implementation and interface).
The way you would implement them in Python or C++ is using their functional facilities. Or you'd write a library that emulated them.
The key feature is first class functions, so you can have combinators like "or()", "and()", "repeat()", "optional()", etc, which take a function pointer as a parameter, as well as the input character stream; and returns the matched characters from the input stream, as well as the rest of the stream. When all combinators have this function signature, you can compose them easily, and writing the parser is becomes a matter of transcribing the ebnf grammar.