Comment by Verdex

4 days ago

Thanks for giving details about your workflow. At least for me it helps a lot in these sorts of discussions.

It is interesting to me, though, that the original post mentioned LLMs "one-shotting" parsers, while this description sounds like a much more in-depth process.

"And there is no parser generator [...] that starts with examples [...]"

People. People can generate parsers by starting with examples. Which, again, is more in line with the original "one-shot parsers" comment.

If people are finding LLMs useful as part of a process for parser generation, then I'm glad. (And testing parsers is pretty painful for me, so I'm interested in the test case generation.) However, I'm much more interested in the existence or non-existence of one-shot parser generation.

7 comments

steveklabnik 4 days ago

I recently did something similar, but different: I gave Claude some code examples of a Rust-like language, and it wrote a recursive descent parser for me. That was a one-shot, though it's a very simple language.

After more features were added, I decided I wanted BNF for it, so it went and wrote it all out correctly, after the fact, from the parser implementation.

Verdex 4 days ago
Can you give more info?
How big of a number is "some"?
Also, what kind of prompts were you feeding it? Did you describe it as Rust-like? Anything else you feel is relevant would help.
[Is there a GitHub link? I'm more than happy to do the detective work.]
- steveklabnik 4 days ago
  
  Like three or four. It's a very simple language: a main function whose value is the error code, functions of one argument returning one value, only ints, basic control flow and math.
  I just opened the repo, here's the commit that did what I'm talking about: https://github.com/steveklabnik/rue/commit/5742e7921f241368e...
  Well, the second part anyway, with the grammar. Writing the lexer starts at https://github.com/steveklabnik/rue/commit/a9bce389ea358365f..., and it was basically this program.
  If I had written down the prompts, I'd share them, but I didn't.
  Please ignore the large amount of LLM bullshit in here. Since it was private while I did this, I wasn't really worried about how annoying and slightly wrong the README etc. were. HEAD is better in that regard.
  
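For readers who have not written one, here is a minimal sketch of what a recursive descent parser for a small slice of such a language might look like. It is not the code from the rue repository. The syntax (`fn name() { expr }`), the token rules, the AST shape, and every name in it (`tokenize`, `Parser`, `parse`) are assumptions made up for illustration, and it covers only integer math inside a zero-argument function, not arguments or control flow.

```python
import re

# Hypothetical token rules: integers, identifiers, or any single character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(src):
    """Split source text into (kind, text) pairs, skipping whitespace."""
    tokens = []
    for num, ident, punct in TOKEN_RE.findall(src):
        if num:
            tokens.append(("int", num))
        elif ident:
            tokens.append(("ident", ident))
        elif punct.strip():
            tokens.append(("punct", punct))
    return tokens


class Parser:
    """One method per grammar rule; each consumes tokens and returns a tuple AST."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", "")

    def expect(self, kind, text=None):
        tok = self.peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {tok[1]!r}")
        self.pos += 1
        return tok[1]

    # function := "fn" ident "(" ")" "{" expr "}"
    def function(self):
        self.expect("ident", "fn")
        name = self.expect("ident")
        self.expect("punct", "(")
        self.expect("punct", ")")
        self.expect("punct", "{")
        body = self.expr()
        self.expect("punct", "}")
        return ("fn", name, body)

    # expr := term (("+" | "-") term)*
    def expr(self):
        node = self.term()
        while self.peek() in (("punct", "+"), ("punct", "-")):
            op = self.expect("punct")
            node = (op, node, self.term())
        return node

    # term := atom (("*" | "/") atom)*
    def term(self):
        node = self.atom()
        while self.peek() in (("punct", "*"), ("punct", "/")):
            op = self.expect("punct")
            node = (op, node, self.atom())
        return node

    # atom := int | "(" expr ")"
    def atom(self):
        if self.peek() == ("punct", "("):
            self.expect("punct", "(")
            node = self.expr()
            self.expect("punct", ")")
            return node
        return ("int", int(self.expect("int")))


def parse(src):
    """Parse a single function definition and reject trailing input."""
    parser = Parser(tokenize(src))
    tree = parser.function()
    parser.expect("eof")
    return tree
```

Calling `parse("fn main() { 1 + 2 * 3 }")` returns a nested tuple in which the multiplication sits below the addition, because `term` binds tighter than `expr`. Each grammar rule in the comments maps onto one method, which is also why reading BNF back out of a parser written this way is fairly mechanical.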

wrs 4 days ago

I guess I don't really understand the goal of "one-shot" parser generation, since I can't even do that as a human using a parser generator! There's always an iterative process, as I find out how the language I wanted isn't quite the language I defined. Having somebody or something else write tests actually helps with that problem, as it'll exercise grammar cases outside my mental happy path.

Verdex 4 days ago
The comment that started this whole thread off mentioned LLMs one-shotting parsers. I didn't think an LLM could one-shot a parser, and I am interested in parsers, which is why I asked for more info.
It's not a goal of mine, but because of my interest in parsing I wanted to know whether this was something that was actually happening or whether it was hyperbole.
- wrs 3 days ago
  
  Well, I mean, it sort of did one-shot the parser in my case (with a few bugs, of course). It just didn't one-shot the parser I wanted, largely because my definition was unclear. It would be interesting to see how it did if I went to the trouble of giving it a truly rigorous prompt.