Comment by 10000truths
1 year ago
The pipeline syntax as presented is nicer than the status quo, but I'd prefer a syntax that models query execution as a directed graph of operations. Doing so would make some of the more complex SQL query constructs much more straightforward to represent:
* Joins can be modelled as a "cross-referencing" operation that consumes two (or more) data streams and produces a single data stream
* CTEs can be modelled as producing multiple data streams
* Recursive CTEs can be modelled as cycles in the execution graph
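To make the first bullet concrete, here is a minimal sketch in Python of a query as a graph of operations, where a join is just a node with two inputs. The names `Node`, `evaluate`, and `join_on`, and the sample rows, are all made up for illustration; this is not any real query engine's API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    """One operation in the graph: a name, a function, and its input nodes."""
    name: str
    op: Callable[..., list]
    inputs: list = field(default_factory=list)


def evaluate(node, cache=None):
    """Evaluate a node after its inputs, computing each named node only once."""
    if cache is None:
        cache = {}
    if node.name not in cache:
        args = [evaluate(src, cache) for src in node.inputs]
        cache[node.name] = node.op(*args)
    return cache[node.name]


def join_on(key):
    """Build a join operation: consumes two row streams, produces one."""
    def op(left, right):
        return [{**l, **r} for l in left for r in right if l[key] == r[key]]
    return op


# Two source streams (hypothetical data) feeding a single join node.
users = Node("users", lambda: [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}])
orders = Node("orders", lambda: [{"id": 1, "total": 30}, {"id": 1, "total": 5}])
joined = Node("joined", join_on("id"), [users, orders])

result = evaluate(joined)
```

Because results are cached by node name, a node that feeds several downstream operations would be computed once and reused, which is roughly the role the second bullet gives to CTEs. Cycles (the recursive CTE case) are not handled by this sketch.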
> a directed graph of operations
What syntax do you know that can represent a dag in text?
Something like the DOT language used in GraphViz
https://graphviz.org/doc/info/lang.html
Yes, DOT (and the other UML-whatever languages) are absolutely the only extant examples that make an attempt. But again, if you look at DOT you'll see it doesn't actually do anything syntactically - it just has syntax for edge lists.
Any syntax with a let operator to name stuff or a lambda abstraction.
That only gives you trees not DAGs - you can't do fan-in (there's no way to "share" let bound names).
A = select * from tbla
B = select * from tblb
C = select * from A join B
I guess CTEs already provide that (even if they're a bit clunky).
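As a rough illustration of the naming argument, the sketch below binds relations to names, reuses one name twice, and prints the resulting graph as a DOT edge list. `Rel`, `edges`, and `to_dot` are hypothetical helpers, and the extra step `D` is an assumption added only so that `A` is shared by two consumers.

```python
class Rel:
    """A named relation plus the named relations it reads from."""
    def __init__(self, name, *inputs):
        self.name = name
        self.inputs = inputs


def edges(root):
    """Collect (source, consumer) edges reachable from root, visiting each node once."""
    seen, out = set(), []

    def walk(node):
        if node.name in seen:
            return
        seen.add(node.name)
        for src in node.inputs:
            out.append((src.name, node.name))
            walk(src)

    walk(root)
    return sorted(out)


def to_dot(root):
    """Render the edge list in DOT's `a -> b;` form."""
    lines = [f"  {a} -> {b};" for a, b in edges(root)]
    return "digraph query {\n" + "\n".join(lines) + "\n}"


A = Rel("A")        # select * from tbla
B = Rel("B")        # select * from tblb
C = Rel("C", A, B)  # A join B: fan-in from two named inputs
D = Rel("D", A, C)  # hypothetical extra step that reuses A a second time

dot = to_dot(D)
print(dot)
```

Four nodes with four edges cannot be a tree, so under these assumptions plain name binding is enough to express a DAG with a shared input.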
Yes we already have that, it's called `async def`.
Check out Substrait, it sounds like what you’re describing.