Comment by data-ottawa

24 days ago

I’ve been working on this.

MCP to ensure whoever is using the agent is authorized. Then I do sql cleaning and rewriting plus validation to ensure only validated query structures and no DDL/DML.

Then when the query is written I apply limits for budget (generally large reads).

Finally, the MCP uses a token with restricted access to a whitelist of tables, with either row level security enabled or table valued functions to apply additional constraints.

I make sure to hide all the sql statements that allow the agent to read table metadata and such.

And then it also needs to be approved by the user in the client.

I don’t think you can do this at scale for many users or low trust users, so they get read only parquet extracts with duckdb.

2 comments

data-ottawa

nico 24 days ago

Are you doing that as your own personal tooling? Are you open sourcing it? Would be happy to take a look and maybe contribute as well

data-ottawa 24 days ago

I am doing this already for an internal tool that accesses a bigquery data warehouse. Long term this will be a feature my company sells.
I will not open source it since it is a paid feature, but I use sqlglot for most of the query parsing, validating and rewriting.