← Back to context

Comment by GordonS

2 years ago

> How long would it take you to write the SQL for that?

I don't want to appear rude, but unless I'm missing something, this is a pretty simple SQL query, of the kind anyone with mimimal SQL experience could write off the top of their head in seconds.

I like the idea of PRQL, but I think a better example is needed to sell it.

I have a moderate amount of SQL experince, but I could not write that query at the top of my head. Maybe you misunderstand what the PRQL query is doing?

Here's the SQL it generates:

WITH table_0 AS ( SELECT , ROW_NUMBER() OVER ( PARTITION BY album_id ORDER BY milliseconds DESC ) AS _expr_0 FROM tracks ) SELECT FROM table_0 WHERE _expr_0 <= 1

If I understand PRQL correctly, it finds the longest song for each album? A simple concept, but not a simple MySQL query.

  • SELECT DISTINCT ON (album_id) * FROM tracks GROUP BY album_id ORDER BY milliseconds DESC;

    For those unfamiliar with Postgres, DISTINCT ON takes only one row for each of the groups based on the supplied columns. So in this case, it will return only one row per album_id.

    Without an ORDER BY, DISTINCT ON chooses a random row (not actually, but you can’t rely on it.) Since we ORDER BY milliseconds DESC, the first row of each group will be the longest one.

  • It's a pretty common pattern (take the top N by group with some order), so I'd expect that it's a familiar pattern to a lot of people who regularly do analytical queries in SQL. It's clearer with formatting/naming:

        WITH ranked_tracks AS (
            SELECT *,
                ROW_NUMBER() OVER ( 
                    PARTITION BY album_id
                    ORDER BY milliseconds DESC
                ) AS row_num
            FROM tracks
        )
        SELECT *
        FROM ranked_tracks
        WHERE
            row_num = 1
    

    But it is still super ugly for such a common need. If I were to add syntax to make this kind of thing easier, I'd just go for a syntax that made something like this valid:

        SELECT *
        FROM tracks
        WHERE
            ROW_NUMBER() OVER ( 
                PARTITION BY album_id
                ORDER BY milliseconds DESC
            ) = 1
    

    Which apparently the QUALIFY statement does in a few dialects.

I think the logic here is: SQL is hard and most people don't know it well. So PRQL is perhaps easier to learn.

The same logic is applied to TypeScript vs JavaScript. Or C vs assembly. Or Nano vs vim. Etc etc.

It's a quandary though. SQL is clearly difficult for most people to get their heads around. It does require a different way of thinking about data, and you can get by with a minimal SQL knowledge for a long time, especially nowadays with ORMs.

But like so many other things, making the investment is worthwhile and pays off in small and large ways, forever (so far).

  • > making the investment is worthwhile and pays off in small and large ways

    And... 'making the investment' takes time, and means that time is not able to be invested someplace else.

    If the majority of your job is writing SQL or similar (data access, etc) then sure - yes, learn more of those tools. Some folks have a wider range of responsibilities that means you have to decide what to make more time investments in, and saying 'yes' to something is necessarily saying 'no' to other things.

    • Eh, sure. Some people are not developers or don't work with systems.

      But 20 hours spent learning SQL pays off for a lifetime. Most people watch more hours of TV than that per week.

      We all make our choices, but very very few people could not find 20 hours somewhere in their lives, with zero net loss.

  • The hard part of SQL is thinking relationally. This obviously doesn’t do anything to affect that — if they actually thought this, they’d be making the same mistake as SQL itself “the reason business users don’t program is because it’s not english enough”

    The problem this is resolving, if it successfully resolves anything at all, is that the SQL language is a mess of random keywords, inconsistent syntax requirements and generates some of the worst error messages known to man. It’s an attempt at making SQL a consistent, simple language — ideally exposing the data model more directly and with less noise

    • I prefer "thinking in sets". Either way, folks try to map objects and structs to their databases (I blame ORMs personally) when the analogy just isn't so. It's like translating French to English word by word and wondering why folks have trouble understanding you.

      As for SQL being too much like English, making the syntax closer to a general purpose functional programming language isn't necessarily an improvement in my opinion.

      1 reply →

I'm not a PRQL fun. But TBH I can't write this SQL from my head. I have thousand lines of written SQL.