Comment by zabzonk
9 days ago
> which queries produced which rows
I doubt many real-world applications could tolerate the amount of data/performance degradation this implies. If you need this (and I can't think why you would), then I think writing your own logging code is the answer, rather than lumbering everyone else with it.
To clarify a point: We store the row count of each operator in the query - not the actual row (that would indeed be madness!). Though we DO have tracing you can control that allow you to enable rows for very specific diagnostics - but the stream is massive and you need to opt in for that.
With that clarified, the logging not as large as you might think (see my other response).
Think of web servers - they routinely store much larger log streams than this with metadata about each hit. You rely on that stream to do various forms of web analytics - yet you would not dream of rolling your own - nor worry about the small overhead you are already paying.
Example:
SELECT x, y FROM foo JOIN bar USING (k)
In a query like this, you will have these operators:
SCAN of foo
SCAN of bar
JOIN (on k between Foo and Bar)
Hope that clarifies the point... That's 3 operators, and we log the row counts of each. That in turn allows you answer questions about your data model and how it is being used.
If they did it efficiently, then they might have a game changer.
Well, you can do it efficiently as you like, but if the data is to be historically accurate on a writeable table then if we are recording
it is likely to require a copy of foobar at the time of the query, or doing amazing things with the transaction log. I agree this would be a programming tour de force, but I question its utility.
Punishing someone for returning all rows in a large table is a feature, not a bug. ;-)
Having all state preserved at any given point in time is also useful in itself. It costs almost nothing if the table doesn’t change, and continuous point-in-time query capability can be extremely useful.
Access audits, when important, are very important. I’d be fine with the system blocking access after someone tries to return all lines in a table, even if only for the reason it might be someone trying to export data without a good reason.