Comment by dagss
3 months ago
I have been writing a ton of SQL -- implementing a lot of the business logic of a (stream processing) application in it. I really really like it, especially that I bring the computation to the data instead of the data to the computation.
I often talk to developers who hate that idea though. They want me to instead move all the data to the backend, for a massive IO hit, just so that the computations can be expressed in a "real" programming language.
I think SQL the concept is good but SQL the language is the problem. There are so many awkward things in it -- not strange as it has had no competition in 40(?) years!! The mental model of the program I write is fine but I really need to overlook the syntax and to the program I am really writing to see any elegance...
What we need I think is a properly designed programming language, designed for existing databases (Postgres, MSSQL) compiling to SQL dialects. I see some contenders but they all have some attachment to a special domain, such as not allowing data modifications (PreQL) or being associated with other databases.
Itching to do it myself, but it's a lot of work and a long long road to any adoption with no guarantee of success, and no revenue stream that I can think of.
The most popular backend languages were made by large companies, but I think coding in SQL is in a catch-22 where it will be frowned upon until there is a better language and no better language until it is more popular..
Me too me too :D
There's a lot that is very right about SQL, but a few clunky bits around the edges.
CTEs made a lot of difference, and window functions - which can be a bit head bending - made difficult things a tiny bit easier.
I'm using BigQuery, which supports structs and arrays, but only recently allowed arrays to be grouped by, although there is still no equality check etc.
BigQuery is slowly adding more sugar, like aggregate UDFs and polymorphic UDFs using ANY TYPE parameters etc, and I find myself putting more reused logic into tidy functions, but my pet want is for temporary functions to be declared and scoped like CTEs so they integrate a lot better into tooling like DBT that wants everything to be in one statement.
And the one most productive thing they could add? Allowing you to specify null behaviour on JOIN USING. (Having to spell out foo.bar IS NOT DISTINCT FROM bar.bar on a join is unobvious and ugly. Something like USING (bar RESPECT NULLS) would be so much nicer.)
I can't put my finger on it but I think many people see this as two operating modes, for lack of a better term. The more monolithic and enterprisey your solution (and with bespoke DBMS), the more it leans towards anything more complex than an index and maybe a couple triggers - and the more microservices-y where every small service owns its own database (and only half of them are RDBMS) the less complex code is desired in the DB itself, because you're also migrating off of single instances/clusters a lot (just taking a relatively dumb data dump with you, or even just adding new replicas, ship-of-theseus-like).
PRQL is awesome. There's a similar competitor whose name I can't find now.