Comment by maximilianroos

4 years ago

This is awesome! I'll add a link to it on PRQL.

I guess the biggest difference between FunSQL (and similarly dbplyr) and PRQL is that the former needs a Julia (or R) runtime to run.

I really respect the library and keen to see how it develops.

FunSQL.jl requires Julia to run (obviously as it is a Julia library) but it produces standard SQL so Julia in this case is just an implementation language.

I have re-implemented parts of FunSQL in Python and OCaml (the one I have ended up using) and have added a concrete syntax similar to what you have in PRQL.

    from employees
    define
      salary + payroll_tax as gross_salary,
      gross_salary + benefits_cost as gross_cost
    where gross_cost > 0 and country = 'usa'
    group by title, country
    select
      title,
      country,
      avg(salary) as average_salary,
      sum(salary) as sum_salary,
      avg(gross_salary) as average_gross_salary,
      sum(gross_salary) as sum_gross_salary,
      avg(gross_cost) as average_gross_cost,
      sum(gross_cost) as sum_gross_cost,
      count() as count
    order by sum_gross_cost
    where count > 200
    limit 20

But, in my mind, the biggest difference between PRQL and FunSQL is the way FunSQL treats relations with `GROUP BY` - as just another kind of namespaces, allowing to defer specifying aggregates. A basic example:

    from users as u
    join (from comments group by user_id) as c on c.user_id = u.id
    select
      u.username,
      c.count() as comment_count,
      c.max(created_date) as comment_last_created_date

The `c` subrelation is grouped by `user_id` but it doesn't specify any aggregates - they are specified in the `select` below so you have all selection logic co-located in a single place.

I think this approach is very powerful as it allows you to build reusable query fragments in isolation but then combine them into a single query which fully specifies what's being selected.

Writing an alternative syntax is straight forward. Perhaps prototype PRQL using xi's excellent FunSQL backend? This way it's working out of the gate. Once syntax+semantics are pinned, writing another backend in the language of your choice would then be easier. Getting the backend correct is non-trivial work, and xi has done this already. Besides, we need a sandbox syntax anyway, so it might be fun to collaborate.