← Back to context

Comment by blintz

4 years ago

This is a nice idea, especially given all the work people have done recently to make in-language querying nicer (Spark comes to mind).

My only gripe is the 'auto-generated' column names for aggregates. This seems like a recipe for disaster - what if there is already (as there almost certainly will be) named "sum_gross_cost"? The behavior also just seems rather unexpected and implicit. My suggestion would be simple syntax that lets you optionally give a name to a particular aggregate column:

    ...
    filter gross_cost > 0
    aggregate by:[title, country] [
        average salary,
        sum gross_salary,
        average gross_cost,
        let sum_gc = sum gross_cost,
        count,
    ]
    sort sum_gc

While it might seem a little uglier, it seems much more sustainable in the long run. If this is really too gross, I'd advocate some token other than underscore that is reserved for aggregation variables; perhaps `sum@gross_cost` or `sum#gross_cost`.

> My only gripe is the 'auto-generated' column names for aggregates

For what it's worth, a similar problem already exists with SQL. Something simple like

  select count(*) from my_table;

automatically aliases the column to `count`, even if `my_table` has a column called `count`.

In practice, I don't think this is a major problem.

Definitely — giving the option of naming them is great.

I'm not sure whether we should force naming? When I'm writing a query often I'm fine with something auto-generated when starting out.