Comment by blintz
4 years ago
This is a nice idea, especially given all the work people have done recently to make in-language querying nicer (Spark comes to mind).
My only gripe is the 'auto-generated' column names for aggregates. This seems like a recipe for disaster - what if there is already (as there almost certainly will be) named "sum_gross_cost"? The behavior also just seems rather unexpected and implicit. My suggestion would be simple syntax that lets you optionally give a name to a particular aggregate column:
...
filter gross_cost > 0
aggregate by:[title, country] [
average salary,
sum gross_salary,
average gross_cost,
let sum_gc = sum gross_cost,
count,
]
sort sum_gc
While it might seem a little uglier, it seems much more sustainable in the long run. If this is really too gross, I'd advocate some token other than underscore that is reserved for aggregation variables; perhaps `sum@gross_cost` or `sum#gross_cost`.
> My only gripe is the 'auto-generated' column names for aggregates
For what it's worth, a similar problem already exists with SQL. Something simple like
automatically aliases the column to `count`, even if `my_table` has a column called `count`.
In practice, I don't think this is a major problem.
Which database does that? MSSQL doesn't assign any name to the column in this case.
Postgres does this
Definitely — giving the option of naming them is great.
I'm not sure whether we should force naming? When I'm writing a query often I'm fine with something auto-generated when starting out.