Comment by data-ottawa

4 days ago

I’ve been working on this.

MCP to ensure whoever is using the agent is authorized. Then I do sql cleaning and rewriting plus validation to ensure only validated query structures and no DDL/DML.

Then when the query is written I apply limits for budget (generally large reads).

Finally, the MCP uses a token with restricted access to a whitelist of tables, with either row level security enabled or table valued functions to apply additional constraints.

I make sure to hide all the sql statements that allow the agent to read table metadata and such.

And then it also needs to be approved by the user in the client.

I don’t think you can do this at scale for many users or low trust users, so they get read only parquet extracts with duckdb.

Are you doing that as your own personal tooling? Are you open sourcing it? Would be happy to take a look and maybe contribute as well

  • I am doing this already for an internal tool that accesses a bigquery data warehouse. Long term this will be a feature my company sells.

    I will not open source it since it is a paid feature, but I use sqlglot for most of the query parsing, validating and rewriting.