Comment by vendiddy
2 years ago
The thing I hate most about SQL is lack of composability.
In most languages it's easy to pull out functions.
In SQL you end up with a giant hard to comprehend mess.
I think the underlying relational concepts in SQL are sound but I'd love to see ideas like PRQL that aim make SQL easier to write and maintain.
Stored procedures and functions are nice but don't allow the basic idea of breaking a large query apart into smaller logical components.
Your statement about breaking large queries apart is wrong. You can write queries with CTES to improve readability, and extract CTEs into functions that can be selected as queries get too large and unwieldly. SQL is just as composable as any other language.
The thing that's lacking right now is the tooling for managing/testing/deploying database code. There are solutions out there and the supabase folks have been working to make things better but database first development still has some hurdles in terms of DX.
One issue with functions though is that they can change performance in unpredictable ways. For example a colleague of mine recently altered a function I had written that was used in multiple hot-path queries. The change he made accidentally caused the function to no longer be inline-able on PostgreSQL. Once the function couldn't be inlined then the PostgreSQL planner wasn't able to select the appropriate indices and the performance of several of the queries exploded by about 100x.
So while it's true it can be composed etc the current state of the art planners struggle except under very simple/constrained scenarios.
If a function is marked stable it should not impact the query plan at all, since stable functions are essentially in-lined before planning. If logic is unstable a view is probably going to be a better abstraction than a function.
6 replies →
I'm not sure how adding another layer of abstraction helps with that.
The problem you encoutered is that the rewritten function was either no longer table-valued, or else it was no longer deterministic (which is what that big list of rules for inlining really means). But that problem doesn't go away by adding a layer of abstraction. The need to understand relational determinism doesn't disappear. The need to understand SARGability doesn't go away. You can't really abstract the problem away.
Do you have any links to a basic example on using CTEs and functions to keep SQL maintainable?
I've used CTEs, but I had not tried breaking up an SQL query into functions. Didn't know that was possible!
For whatever reason, I feel like I end up with a giant blob of SQL when writing SQL and it's incredibly frustrating.
You can just use chatgpt to rewrite sql with ctes, and extract functions. It's quite good at it, particularly gpt4.
That being said, CTEs are a really good way to write complex queries. They let you tag bits of query with meaningful names, and each thing you tag is accessible to every CTE after it so you can build up an almost imperative data flow by just doing select transforms one after another. That way you're building hard queries from the bottom up rather than the top down.
1 reply →
> You can write queries with CTES to improve readability, and extract CTEs into functions that can be selected as queries get too large and unwieldly.
Personally I'd go for breaking them into views. IIRC as of around postgres 11-13 they're no longer a barrier for the query planner.
You are right, views are a good choice in a lot of instances. Functions give you more behavioral flexibility, easy multiple version support (in postgres) and fewer issues with ddl dependencies, but views are semantically clearer, easier to work with and give good usage flexibility.
Composition is available in sql, but works a bit different than in a procedural language. In sql you express sets of data, composition consists of defining subsets that you compose into more complex sets. Views and CTEs are the tools for composition in SQL.