Comment by tkejser

6 days ago

Hi All

Original author here (no, I am not an LLM).

First, a clarifying point on INFORMATION_SCHEMA. In the post I make it clear that this interface is supported by pretty much every database since the 1980s. Most tools would not exist without it. When you write an article like this, you are trying to reach a broad audience, and not everyone knows that there are standards for this.
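For anyone who has not run into it, this is roughly what querying INFORMATION_SCHEMA looks like (the standard catalog views; the `'public'` schema name here is the PostgreSQL default, other databases differ slightly):

```sql
-- List user tables and their columns via the standard catalog views
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;
```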

But, our design goes further and treats all metadata as data. It's joinable, persisted and acts, in every way, like all other data. Of course, some data we cannot allow you to delete - such as that in `sys.session_log` - because it is also an audit trail.
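To make "joinable" concrete, here is the kind of query this enables. A minimal sketch - `sys.session_log` is real, but `sys.query_log` and the column names below are illustrative stand-ins, not the exact schema:

```sql
-- Which sessions ran the slowest queries last week?
-- (sys.query_log and the column names are illustrative)
SELECT s.user_name,
       s.client_address,
       q.query_text,
       q.total_runtime_ms
FROM sys.session_log AS s
JOIN sys.query_log   AS q
  ON q.session_id = s.session_id
WHERE q.start_time > CURRENT_DATE - INTERVAL '7 days'
ORDER BY q.total_runtime_ms DESC
LIMIT 10;
```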

Consider, by contrast, PostgreSQL's `pg_stat_statements`. This is an aggregated, in-memory summary of recent statements. You can get the high-level view, but you cannot get every statement run and how that particular statement deviated from statements like it. You also cannot get the query plan for a statement that ran last week.
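For contrast, this is roughly what `pg_stat_statements` gives you: one aggregated row per normalised statement, with totals and means but no individual executions or plans (column names as of PostgreSQL 13+):

```sql
-- One row per normalised statement; aggregates only
SELECT query,
       calls,
       mean_exec_time,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```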

To address the obvious question: "Isn't that very expensive to store?"

Not really. Consider a pretty aggressive analytical system (not OLTP) - you get perhaps 1000 queries/sec. The query text is normalised and so is the plan, so the actual query data (runtimes, usernames, skewness, stats about various operators) is on the order of a few hundred bytes per query. Even for a very busy system, that adds up to double-digit GB per day - on cheap Object Storage. Your company web servers store orders of magnitude more data than that in their logs.
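Quick back-of-envelope, assuming ~300 bytes per query record (runs as-is in PostgreSQL):

```sql
-- 1000 queries/sec * ~300 bytes * 86400 sec/day, expressed in GB
SELECT 1000 * 300 * 86400 / 1e9 AS approx_gb_per_day;  -- roughly 26
```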

With a bit of data rotation, you can keep the aggregate sizes manageable over time.
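For example, something along these lines - table and column names are illustrative, and the date syntax is PostgreSQL-flavoured:

```sql
-- Roll raw query records older than 30 days into a daily roll-up,
-- then drop the raw rows (illustrative schema)
INSERT INTO sys.query_log_daily (day, query_hash, executions, total_runtime_ms)
SELECT date_trunc('day', start_time), query_hash, count(*), sum(runtime_ms)
FROM sys.query_log
WHERE start_time < CURRENT_DATE - INTERVAL '30 days'
GROUP BY 1, 2;

DELETE FROM sys.query_log
WHERE start_time < CURRENT_DATE - INTERVAL '30 days';
```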

What stats do we store about queries?

- Rows in each node (count, not the actual row data, as that would be a PII problem)
- Various runtimes
- Metadata about who, when and where (ex: cluster location)
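To give a feel for the kind of question that answers - again with illustrative table and column names:

```sql
-- Which plan operators produced the most rows for one user's queries
-- yesterday? (sys.plan_node_log and its columns are illustrative)
SELECT query_id, operator, rows_produced, runtime_ms, cluster_location
FROM sys.plan_node_log
WHERE user_name = 'etl_service'
  AND start_time >= CURRENT_DATE - INTERVAL '1 day'
ORDER BY rows_produced DESC
LIMIT 20;
```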

Again, these are tiny amounts of data in the grand schema of things. But somehow our industry accepts that our web servers store all of that, while our open source databases don't (this level of detail is not controversial in the old school databases, by the way).

Of course, we can go further than just measuring the query plan.

Performance profiling of workers is a concept you can talk about - so it is also metadata. Let us say you want to really understand what is going on inside a node in a cluster.

You can do this:

```sql
SELECT stack_frame, samples
FROM sys.node_trace
WHERE node_id = 42
```

This returns a 10-second sample (via `perf`) of the process running on one of the cluster nodes.

(Obviously, that data is ephemeral - we are good at making things fast, but we can't make tracing completely free.)

Happy to answer all questions