Comment by lovasoa

5 hours ago

The thing I would have liked to know is why they don't use an existing fast SQL parser. Was being slightly incompatible with all existing SQL dialects a product requirement?

6 comments

robbie-c 5 hours ago

Our SQL is very similar to ClickHouse SQL: we used ClickHouse SQL as a starting point because ClickHouse is our underlying DB. We needed our own parser so that we could add additional language features on top.

bonzini 3 hours ago

I think you should clarify that (or whether) while you didn't look at the generated code, you are actually going to adjust it in the future.
How did the two approaches compare in terms of code readability?

__s 3 hours ago

This is pretty much the case with every SQL dialect

-warren 5 hours ago

I think that's exactly what indirectly happened. This guy didn't optimize the parser. Someone else did -- years ago. That work was pulled into the LLM and made it look like magic.

bonzini 3 hours ago
Note that it's not a particularly optimized algorithm: recursive descent + specialized subparser for expressions is simply the standard way to write parsers by hand. It's ANTLR which is super flexible but also dog slow.
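
To make that shape concrete, here is a minimal sketch in Python. It assumes a made-up toy grammar (a single SELECT with arithmetic expressions), not HogQL or ClickHouse SQL, and every name in it (`Parser`, `parse_select`, `parse_expr`, `parse`) is hypothetical. The statement rule is plain recursive descent, and expressions go through a small precedence-climbing subparser:

```python
import re

# Hypothetical toy grammar, not HogQL or ClickHouse SQL:
#   select := "SELECT" expr ("," expr)* "FROM" IDENT
#   expr   := numbers, identifiers, parentheses, and + - * /
TOKEN_RE = re.compile(r"\s*(\d+|\w+|[-+*/(),])")
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expect(self, expected):
        token = self.advance()
        if token.upper() != expected:
            raise SyntaxError(f"expected {expected}, got {token}")

    # Recursive descent: one method per grammar rule.
    def parse_select(self):
        self.expect("SELECT")
        columns = [self.parse_expr()]
        while self.peek() == ",":
            self.advance()
            columns.append(self.parse_expr())
        self.expect("FROM")
        table = self.advance()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected trailing token {self.peek()}")
        return ("select", columns, table)

    # Specialized expression subparser: precedence climbing.
    def parse_expr(self, min_prec=1):
        left = self.parse_atom()
        while True:
            op = self.peek()
            prec = PRECEDENCE.get(op)
            if prec is None or prec < min_prec:
                return left
            self.advance()
            # Parsing the right side at prec + 1 makes operators left-associative.
            right = self.parse_expr(prec + 1)
            left = (op, left, right)

    def parse_atom(self):
        token = self.advance()
        if token == "(":
            inner = self.parse_expr()
            self.expect(")")
            return inner
        if token.isdigit():
            return int(token)
        if re.fullmatch(r"\w+", token):
            return token  # identifier
        raise SyntaxError(f"unexpected token {token}")


def parse(text):
    return Parser(tokenize(text)).parse_select()
```

For example, `parse("SELECT a + 2 * b FROM events")` builds a tree in which `*` binds tighter than `+`. Nothing here is tuned for speed; the point is only that the structure is small and conventional.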
- robbie-c 3 hours ago
  
  Yeah, one of the interesting parts to me while working on this is that the breakpoint for when it's worth writing your own parser vs accepting ANTLR's slowness has shifted massively. Previously it would have been someone's full-time job to maintain. Now with this approach you can get the best of both worlds.