
Comment by Verdex

4 days ago

Parsing is an area that I'm interested in. Can you talk more about your experience getting LLMs to one-shot parsers?

When writing parsers from scratch, LLMs seem to be completely lost. The bleeding edge appears to be able to maybe parse XML, but gives up on programming languages with even the most minimal complexity. One example is C: Gemini refused to even try with macros, and when told to parse C without macros it gave an answer with several stubs where I was supposed to fill in the details.

With parsing libraries they seem better, but ultimately that reduces to "transform this BNF", which, if I had to, I could do deterministically without an LLM.

Also, my best 'successes' have been along the lines of 'parse this well-defined language that just happens to have dozens if not hundreds of verbatim examples on GitHub'. Any time I try to give examples of a hypothetical language, they return a bunch of regexes that would not work in general.

A few weeks ago I gave an LLM (Gemini 2.5 something in Cursor) a bunch of examples of a new language, and asked it to write a recursive descent parser in Ruby. The language was nothing crazy, intentionally reminiscent of C/JS style, but certainly the exact definition was new. I didn’t want to use a parser generator because (a) I’d have to learn a new one for Ruby, and (b) I’ve always found it easier to generate useful error messages with a handwritten recursive descent parser.
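
To illustrate point (b), here is a minimal sketch of a handwritten recursive descent parser in Ruby. The grammar, the `Parser` and `ParseError` names, and the `fail_at` helper are all hypothetical and are not taken from the project described above. The point is only that each parsing function knows exactly what it expected at the moment it fails, so a specific error message comes almost for free.

```ruby
# Hypothetical grammar, for illustration only:
#   expr   := term (("+" | "-") term)*
#   term   := factor (("*" | "/") factor)*
#   factor := NUMBER | "(" expr ")"

class ParseError < StandardError; end

class Parser
  def initialize(src)
    @src = src
    @tokens = []
    # Each token is [text, offset]. Any non-space character that is not part
    # of a number becomes its own token, so stray characters surface as
    # parse errors instead of being silently skipped.
    src.scan(/\d+|\S/) do
      m = Regexp.last_match
      @tokens << [m[0], m.begin(0)]
    end
    @i = 0
  end

  # Parses the whole input and returns the evaluated integer result.
  def parse
    value = expr
    fail_at("end of input") unless peek.nil?
    value
  end

  private

  # One token of lookahead; nil at end of input.
  def peek
    tok = @tokens[@i]
    tok && tok[0]
  end

  def advance
    tok = @tokens[@i]
    @i += 1
    tok[0]
  end

  # Raises an error naming what was expected and what was actually found.
  def fail_at(expected)
    tok = @tokens[@i]
    found = tok ? tok[0].inspect : "end of input"
    offset = tok ? tok[1] : @src.length
    raise ParseError, "expected #{expected}, found #{found} at offset #{offset}"
  end

  def expr
    value = term
    while ["+", "-"].include?(peek)
      op = advance
      rhs = term
      value = op == "+" ? value + rhs : value - rhs
    end
    value
  end

  def term
    value = factor
    while ["*", "/"].include?(peek)
      op = advance
      rhs = factor
      value = op == "*" ? value * rhs : value / rhs
    end
    value
  end

  def factor
    if peek&.match?(/\A\d+\z/)
      advance.to_i
    elsif peek == "("
      advance
      value = expr
      fail_at("')'") unless peek == ")"
      advance
      value
    else
      fail_at("a number or '('")
    end
  end
end
```

With this sketch, `Parser.new("2 * (3 + 4)").parse` returns 14, while `Parser.new("2 * )").parse` raises a `ParseError` saying that it expected a number or '(' and found ")" at offset 4.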

IIRC, it went like this: I had it first write out the BNF based on the examples, and tweaked that a bit to match my intention. Then I had it write the lexer, and a bunch of tests for the lexer. I had it rewrite the lexer to use one big regex with named captures per token. Then I told it to write the parser. I told it to try again using a consistent style in the parser functions (when to do lookahead and how to do backtracking) and it rewrote it. I told it to write a bunch of parser tests, which I tweaked and refactored for readability (with LLM doing the grunt work). During this process it fixed most of its own bugs based on looking at failed tests.
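
The "one big regex with named captures per token" step might look roughly like the sketch below. The token set, `TOKEN_RE`, `Token`, and `lex` are hypothetical stand-ins for a small C/JS-style language, not the actual code from that project.

```ruby
# Hypothetical token set; each alternative is a named capture, and the name
# of the group that matched becomes the token type.
TOKEN_RE = /
    (?<ws>\s+)
  | (?<number>\d+)
  | (?<ident>[A-Za-z_]\w*)
  | (?<op>[-+*\/=(){};])
/x

Token = Struct.new(:type, :text, :pos)

def lex(src)
  tokens = []
  pos = 0
  while pos < src.length
    m = TOKEN_RE.match(src, pos)
    # The regex is unanchored, so a match that starts later than pos means
    # the character at pos belongs to no token.
    if m.nil? || m.begin(0) != pos
      raise ArgumentError, "unexpected character #{src[pos].inspect} at offset #{pos}"
    end
    # Exactly one named group is non-nil: the alternative that matched.
    type, text = m.named_captures.find { |_name, value| value }
    tokens << Token.new(type.to_sym, text, pos) unless type == "ws"
    pos = m.end(0)
  end
  tokens
end
```

For example, `lex("x = 42;").map(&:type)` gives `[:ident, :op, :number, :op]`. Keeping every token in one pattern puts the whole lexical grammar in one place, which also makes it easy to test in isolation before writing the parser.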

Throughout this process I had to monitor every step and fix the occasional stupidity and wrong turn, but it felt like using a power tool: you just have to keep it aimed the right way so it does what you want.

The end result worked just fine. The code is quite readable and maintainable, and I’ve continued with that codebase since. That was a day of work that would have taken me more like a week without the LLM. And there is no parser generator I’m aware of that starts with examples rather than a grammar.

  • Thanks for giving details about your workflow. At least for me it helps a lot in these sorts of discussions.

    Although, it is interesting to me that the original posting mentioned LLMs "one-shot"ing parsers, and this description sounds like a much more in-depth process.

    "And there is no parser generator [...] that starts with examples [...]"

    People. People can generate parsers by starting with examples. Which, again, is more in line with the original "one-shot parsers" comment.

    If people are finding LLMs useful as part of a process for parser generation then I'm glad. (And testing parsers is pretty painful to me, so I'm interested in the test case generation.) However, I'm much more interested in the existence or nonexistence of one-shot parser generation.

    • I recently did something similar, but different: I gave Claude some code examples of a Rust-like language, and it wrote a recursive descent parser for me. That was a one-shot, though it's a very simple language.

      After more features were added, I decided I wanted BNF for it, so it went and wrote it all out correctly, after the fact, from the parser implementation.


    • I guess I don't really understand the goal of "one-shot" parser generation, since I can't even do that as a human using a parser generator! There's always an iterative process, as I find out how the language I wanted isn't quite the language I defined. Having somebody or something else write tests actually helps with that problem, as it'll exercise grammar cases outside my mental happy path.
