Giving LLM agents direct, autonomous access to a real production database with write access seems insane to me.
NO ONE, agent or human, should have direct write access to production databases outside of emergency break-glass scenarios. This is why we have stored routines and API layers to pre-define what writes are allowed. The fact that agents CAN autonomously write to a database does not imply that they should.
For the point about query optimization, again, your agents should not be issuing random queries against a production database. We have had the concept of separate analytics databases with different architectures to support exploratory queries for decades.
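A minimal sketch of that stored-routine pattern in Postgres (table, column, and role names are all hypothetical): the agent's role can execute the vetted routine but cannot touch the table directly.

```sql
-- The only sanctioned write: a pre-defined, reviewed routine.
CREATE FUNCTION deactivate_user(p_user_id BIGINT) RETURNS VOID
LANGUAGE sql SECURITY DEFINER AS $$
  UPDATE users SET active = FALSE, deactivated_at = now()
  WHERE id = p_user_id;
$$;

REVOKE ALL ON users FROM agent_role;
GRANT EXECUTE ON FUNCTION deactivate_user(BIGINT) TO agent_role;
```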
I agree and hope this is the case for anything serious enough. I also don't see this changing any time soon.
There are ways to give safe access to the data, at least read-only, that don't involve production risk and don't sacrifice privacy. For example, database branches with anonymization. Instead of accessing the prod/staging db, the agent creates a branch and has read/write access to that.
(disclaimer: I work at Xata, where we offer copy-on-write branches for Postgres, and the agent use-cases are the most popular right now)
I totally agree! Even read access is risky, especially when the database has some sensitive/personal information about users.
I’m a DBRE. I spend a good portion of my day with a shell into one or more prod databases. The schema definitions in code are scattered between ORM model definitions, Alembic migrations, and Liquibase migrations, so the only reliable way I have of understanding a schema as it exists is to view it. Plus, I am very comfortable with SQL, and the various system catalogs of both MySQL and Postgres, so it’s a ton easier to work with.
Truly sensitive customer information is encrypted, and on an isolated DB cluster that no one has regular access to. I also operate with a read-only grant, because manual writes to a prod DB are generally a terrible idea.
So what do you do for "okay, we need to run this script that we've decided is a necessary operation". Special account? Everything go through the build server? I've been looking for tooling for "I need to do a production operation but I want it to have proper interlocks and reviews".
> autonomous access to a real production databases
remember that filesystems are just sophisticated databases
And the same logic applies.
How does that even work in compliance-relevant scenarios where the audit trail shows some LLM messed with the data? Who, if anyone, is on the hook?
My guess is that if the database is subject to auditing then LLM access (obviously writes in particular, but even reads come with exfiltration risks) will be a hard "no" and instant red flag. When it's a person, there is a sense of accountability and opportunity for remediation.
I suppose that LLMs will be treated as a code artifact and liability will shift upstream towards who deployed/approved the access in the first place. Even though code is essentially deterministic, making that association fairly simple, it's going to boil down to this same paradigm.
Perhaps governance rules will evolve to even explicitly forbid it, but my gut feeling is that for what the future determines to be "practical" reasons (right or wrong) LLMs will warrant an entirely new set of rules to allow them to be in the chain at all.
+ EDIT: both my wife and I have experience in this area, and the current answer is that companies like KPMG don't have an answer yet. Existing rules do help (e.g. there had better be good documented reasons why it was used, and that access was appropriately scoped, etc.), but there is enough ambiguity around these tools that they say "stay tuned, and take caution".
The dev who ran it. The manager who allowed it. The director/VP/CTO who enabled the culture. They all have some responsibility for it.
Whoever provided the authorization credentials to the agent is on the hook.
LLM agents are unlocking demand and supply for applications that wouldn't have been possible before due to time constraints though. There's a growing demand for single user or smaller scoped apps where giving LLM agents direct access means velocity. The failure/rollback model is much easier with these as long as we have good backup hygiene.
This makes no sense whatsoever.
It's not news that if you just give all developers at a company write access to the production databases, owner permissions on all resources, etc. that velocity can be increased. But at what cost?
The reason we don't do that in most cases is that "move fast and break things" only makes sense for trivial, non-critical applications that don't have any real importance, like Facebook.
> There's a growing demand for single user or smaller scoped apps where giving LLM agents direct access means velocity. The failure/rollback model is much easier with these as long as we have good backup hygiene.
This makes no sense to me. For anything that has sensitive payment or personally identifiable data, direct access to the DB is potentially illegal.
> The failure/rollback model is much easier with these as long as we have good backup hygiene.
Have you actually operated systems like this in production? Even reverting to a DB state that is only seconds old can still lose hundreds or thousands of transactions. Which means loads of unhappy customers. More realistically, recovery points are often minutes or hours behind once you factor in detection, validation and operational overhead.
DB revert is for exceptional disaster recovery scenarios, not something you want in normal day-to-day operations. If you are saying that you want to give LLM full access to prod DB and then revert every time it makes a mistake, you aren't running a serious business.
But are those users allowed to see all the data in the database by law? Some privacy laws require that personal information must be hidden from employees unless they have a narrow and specific business reason to view it. Blanket full access to a database may be illegal for that reason.
I think a lot of the objections to your post could be answered by reminding folks of how Microsoft Access databases tend to pop up in small businesses as well as corporate environments outside of IT departments. Yes, they're not "proper" databases but they /get business done/ and often serve as v0 before a real app can be properly conceived of.
One can easily imagine an LLM-enabled database that lets a wider audience build meat-and-potatoes line-of-business apps for small team use with minimal compliance concerns.
> single user
If you're just vibe coding a tool for yourself, you don't have a 'production database' at all, even if you use database technology for storage. Just like many Android apps use local sqlite DBs, but they're not production databases.
Of course in this case no traditional wisdom about production databases matters to you. In other words, it's off-topic.
1) Can you explain what demand and supply mean in this context?
2) In regards to having good backup hygiene, who is we?
I’m all-in on agents but this is a “you’re holding it wrong” situation.
If you want to give your agents a DB for their own work as a scratchpad or something that’s great. They can not only go to town, but also analyze their own work and iterate on it.
If you are talking about a production database, agents should not be hitting it directly under any circumstances. There needs to be an API layer with defined usage patterns, rate limits, etc.
This is basically the same as saying “databases weren’t designed for interns to run live inline migrations in prod”. Yeah of course they aren’t.
> This is basically the same as saying “databases weren’t designed for interns to run live inline migrations in prod”. Yeah of course they aren’t.
And the same as saying "databases weren't designed for non-technical people to connect with report-building tools like Power BI and Excel and run reports in the middle of peak customer checkouts."
As a DBA, I'm constantly surprised by what people think will be completely harmless to hook up to the database server - and then how much havoc it causes. Gonna be a rough decade.
> The API failed silently because the database connection pool was exhausted downstream.
I work with a team that does stuff like this, returning a 200 and a body containing "error: I didn't do what you said because _insert error here_"
The problem is that you returned OK instead of ERROR when things were not OK and there was an ERROR.
It's a design that smells of teams trying to hit some kind of internal metrics by slightly deceptive means.
I had to explain so many times to infrastructure guys why it was not okay that the software they use to manage outages still returns 200s.
>returning a 200 and a body containing "error: I didn't do what you said because _insert error here_"
I've seen this approach before; it mostly follows from using the status code to separate application errors (200 + ok/error in the body) from other kinds of errors that might arise.
HTTP error codes are divided between server (5xx) and client (4xx).
Where do these "application errors" occur if neither on a server nor a client?
I think the reality is that management sees "5xx means server error, so our team's KPI is now server error rate, the lower the better!" Then the team just stops using 500 errors as much as possible. They probably justify it with things like "well, such and such problem isn't our fault, so it's not really a server error." This kind of thinking perverts the intent of 5xx responses. They are supposed to indicate any failure to handle the request that happens on the server, NOT measure whether the dev team is making a good application.
Giving an LLM write access is insane but I gave LLM’s read-only access to our database and it’s been a huge productivity win.
Executives who wouldn’t take the time to build a report are happy to ask an AI agent to do so.
I would hope that you're running this on a replica so that the massive table scan doesn't choke writes to the main db. Even then it's possible to bring the replica down and depending on the technology still create a problem (WAL backup for instance)
Another way to bring prod down even with read is depending on your atomicity settings, try starting a transaction and don’t commit or abort it, just leave it dangling. That’s a cute one
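One guardrail for both failure modes, as a sketch (both settings are stock Postgres parameters; the role name and values are illustrative): have the server itself cancel runaway statements and kill sessions that sit idle inside a transaction.

```sql
-- Cancel any statement that runs longer than 30s for this role,
-- and kill sessions that hold a transaction open while idle.
ALTER ROLE agent_readonly SET statement_timeout = '30s';
ALTER ROLE agent_readonly SET idle_in_transaction_session_timeout = '60s';
```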
How do you validate that the reports are correct? What if an executive makes a wrong business decision because the LLM wrote a wrong SQL query?
https://thedailywtf.com/articles/The-Great-Excel-Spreadsheet
> What if an executive makes a wrong business decision
I jokingly tell students, "We all know executives are gonna make bad decisions no matter what the data says. Might as well give them the random numbers more quickly."
The same way we've always done it - glance at it and see if the numbers look like they're within an order of magnitude of what looks reasonable.
How do you prevent your customer data being used for training?
The same way everyone does, by not using free LLMs, but instead paying OpenAI/Microsoft/Anthropic for an enterprise subscription?
This article has all the correct conclusions and solutions based on one assumption that doesn’t have any hold in reality - that someone would be insane enough to allow direct DB access to an AI agent.
someone is potentially insane enough it would seem: https://news.ycombinator.com/item?id=47911579
I totally agree on investing in a sane data model upfront. So many production systems have schemas that only made sense to the engineer that created them. I would be delighted if I can read a schema and understand what a column means without having to dig through a bunch of migration PRs.
I recently encountered `is_as BOOL` in an important table. After way too much invested time we found out it meant "is active service". </DDL rant>
I integrate with many ERPs and this is the bane of my existence.
One of the worst has field names like `ft_0001...N` and table names like `UNCC_00001...N`, all in `text` fields (even numbers!), zero FKs, almost no indexes, and what are views?
The other has this funny field: a blob that needs decoding with a specific FreePascal version. The field? The one holding the price of the product.
Another has, in the same column, a mix of "," and "." as the decimal separator, and I need to check the digit places to deduce which is which.
FUN.
P.S.: I normalize all these ERPs into my own schema and have gotten praise for things like: my product table is called products.
I may have worked with that one. Did it have a parallel schema that mapped tables and fields with legible/customisable names, so every SQL call had to join the mapping tables to hit the required table and fetch the fields you were after?
Wrote a Windows .Net program once upon a time to convert the data from other financial CRM systems into the system I worked on. Built a data mapping tool as no customer we onboarded placed "custom" data in the same tables or fields even when using the same financial system.
I actually miss doing that kind of work, my brain seems to be wired to find it fun. Writing SQL is one thing I don't delegate to an AI or even an ORM like Doctrine.
I think the best db schema I had the displeasure of working with was one where it was a requirement that every table and column name NOT have vowels, except for the few that could, and "the few that could" were governed entirely by a spreadsheet owned by the DB admin.
And so you got tables like LANDMRK and columns like RCR_RCRDR.
Oh my. What could possibly be the justification for this?
Postgres has COMMENT ON to help with this but descriptive names are helpful.
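For example, with the `is_as` column from above (a sketch; the table name is hypothetical):

```sql
-- Attach the documentation to the schema itself, so anyone (or any
-- agent) inspecting the catalog sees what the flag means.
COMMENT ON COLUMN service.is_as IS 'Whether the service is active ("is active service")';
```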
> I recently encountered `is_as BOOL` in an important table.
Sounds like a table designed by Forrest Gump.
There are two broad types of databases: operational and analytical.
Operational databases store transactions and support day-to-day application workflows.
For analysis, data is often copied into separate analytical databases (data warehouses), which are structured for efficient querying and large-scale data processing. These systems are designed to handle complex, random queries and heavy workloads.
LLM agents are the best way to analyze data stored in these databases. This is the future.
> LLM agents are the best way to analyze data stored in these databases
Why, and how?
> Why
Based on my experience with Claude, it's pretty damn good at doing data analysis, if given the right curated data models. You still need to eyeball the generated SQL to make sure it makes sense.
> and how?
1. Replicate your Postgres into Snowflake/Databricks/ClickHouse/etc, or directly to Iceberg and hook it up to Snowflake/Databricks/ClickHouse/etc.
2. Give your agent read access to query it.
3. Build dimensional models (facts and dimensions tables) from the raw data. You can ask LLM for help here, Claude is pretty good at designing data models in my experience.
4. Start asking your agent questions about your data.
Keep steps 3-4 as a tight feedback loop. Every time your agent hallucinates or struggles to answer your questions, improve the model.
Side note: I'm currently building a platform that does all 3 (though you still need to do 2 yourself), you just need Postgres + 1 command to set it up: https://polynya.dev/
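A minimal sketch of the facts-and-dimensions shape from step 3 (all table and column names hypothetical):

```sql
-- Descriptive attributes live in dimension tables...
CREATE TABLE dim_customer (
  customer_key BIGINT PRIMARY KEY,
  name         TEXT,
  segment      TEXT,
  signup_date  DATE
);

-- ...and the narrow fact table holds measures keyed to them.
CREATE TABLE fact_orders (
  order_key    BIGINT PRIMARY KEY,
  customer_key BIGINT REFERENCES dim_customer (customer_key),
  order_date   DATE,
  amount_usd   NUMERIC(12, 2)
);
```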
Wide tables and rich data. Dozens to hundreds of columns, some of them a json dimension. Way easier to explore these datasets with AI
I'm not sure why you are giving your agents access to query your OLTP database, let alone write to it. The pattern that I use at work is:
- Read access through OLAP, not OLTP. You just need to set up near real-time replication between your OLTP and OLAP.
- Write access through an API, just like your application. You can add fancy things like an approval layer, e.g. your agent cannot "ban_user(id)", but it can "request_to_ban_user(id)", and the action only happens once you approve it (see the sketch below).
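A sketch of that approval layer as a table (all names hypothetical): the agent may only file requests; a human flips the status, and a separate trusted job applies approved actions.

```sql
CREATE TABLE agent_action_requests (
  id           BIGSERIAL PRIMARY KEY,
  action       TEXT NOT NULL,                 -- e.g. 'ban_user'
  target_id    BIGINT NOT NULL,
  status       TEXT NOT NULL DEFAULT 'pending'
               CHECK (status IN ('pending', 'approved', 'rejected')),
  requested_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- The agent can file requests but cannot approve or apply them.
GRANT INSERT ON agent_action_requests TO agent_role;
GRANT USAGE ON SEQUENCE agent_action_requests_id_seq TO agent_role;
```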
I don't understand the premise. Who is letting "agents" run arbitrary SQL against their database without human review?
Before redesigning your database, consider seeing a psychiatrist.
I agree with some of the reasoning in the article, especially around agents' write behavior. However, a large portion of this is a consequence of poorly defined agent roles, and of moving away from building deterministic systems.
> An agent working on a customer analytics task might issue a join across five tables that has never been issued before, hold the connection while it thinks about the result, then issue a completely different follow-up.
Aren't agents simply automated human personas? Like, why would an agent make a join across five tables, given that its reasoning is grounded in human-aligned decisions?
For example: if the agent knows the schema and is aware of the indexes defined, shouldn't it only work its way through those 'known paths'? Why would you allow agents to deviate from known paths?
I'm exasperated whenever I read articles like this. Anyone who underscores the difference between humans and agents by saying "[agents] write based on their current understanding of the task, which may be wrong" is clearly working with a different species of human than the one I've worked with.
> None of this requires new technology. It requires treating the database as a defensive layer that assumes the caller might be wrong, might retry, and might not be watching the results.
This is one of those takes that is so close to understanding the problem, and then drawing an insane conclusion.
The problem is that AI agents and the code they output is untrustworthy, buggy, insecure, and lacking in any of the standards the industry has developed over the last 30 years. The solution to this is "don't use AI agents", not "change the rest of the stack to accommodate garbage".
How is this different from "people who can't write SQL should not use ORMs"? With code agents you can write raw SQL better than most developers, and if you want, you can ask for the same implementation using whatever ORM you prefer. Lastly, AI-generated code is supposed to be reviewed by a human, just like code written by a colleague. The thing is, with AI you can establish automatic review guidelines, and even ask for proper benchmarks and optimizations, at zero cost.
The article describes idempotency keys and then completely misses making them the PK. The example is already using UUIDv4 as a PK, so they're clearly not optimizing for performance. If you're using the first 32 characters of a SHA256 hash, congratulations, store it as BINARY(16) / BYTEA - it's even half the size of an encoded UUID, to boot.
Also, the DB will most certainly not silently ignore a unique constraint violation: it will send an error back. EDIT: unless you’re using INSERT OR IGNORE, of course.
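A sketch of the PK variant (hypothetical table; the bytea literal is a placeholder): the idempotency key is the primary key, and retries are absorbed explicitly rather than surfacing as errors.

```sql
CREATE TABLE payments (
  idempotency_key BYTEA PRIMARY KEY,  -- first 16 bytes of a SHA-256
  amount_cents    BIGINT NOT NULL,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- A retry with the same key inserts nothing; RETURNING tells the
-- caller whether this call actually created the row.
INSERT INTO payments (idempotency_key, amount_cents)
VALUES ('\x00112233445566778899aabbccddeeff', 4200)
ON CONFLICT (idempotency_key) DO NOTHING
RETURNING idempotency_key;
```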
The contract is still there, and the humans making decisions at some layer are still there. Decisions were made, risks were dismissed, and that won't protect production data if some of those risks materialize. A database won't survive a manager who starts hitting it with an actual hammer either, or an agent with enough privileges deciding to delete or corrupt all the data. And adding a mitigation like soft deletes is just another way of dismissing the risks.
> Never let an agent hard-delete anything. Use soft deletes as a baseline for any table an agent can write to
How do you even enforce it?
And why are you even giving the agent access to the live DB in the first place?
> How do you even enforce it?
(yes, the recommended way would be to simply grant only SELECT,INSERT,UPDATE but if I were the DBA here I would definitely put in place an explicit deny)
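A sketch of that explicit deny in Postgres (role and table names hypothetical): deny DELETE at the grant level and route removals through a soft-delete column instead.

```sql
-- The agent role gets everything except DELETE (and TRUNCATE).
GRANT SELECT, INSERT, UPDATE ON orders TO agent_role;
REVOKE DELETE, TRUNCATE ON orders FROM agent_role;

-- Soft delete: the only sanctioned way to "remove" a row.
ALTER TABLE orders ADD COLUMN deleted_at TIMESTAMPTZ;
UPDATE orders SET deleted_at = now() WHERE id = 42;  -- instead of DELETE
```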
> Connections are Brief
This doesn't make sense, in the context of the author's chosen example (postgres). Postgres connections are very heavy and there is a huge performance penalty for cycling them quickly, and a whole range of silly workarounds for this fact (pgbouncer). Maybe the author meant to say that sessions are brief.
I think the spirit of this post has merit, but the premise is flawed. ORMs have been causing this same class of problem for decades. Furthermore, it's not at all uncommon for humans to create different queries for the same result, and for them to follow different review paths for the same underlying database.
A query created by a human and reviewed by at least one other human becomes static after it's merged. But the query from an LLM is dynamic: it can change between two calls in the same session if the LLM sees a reason to change it, and there is no review pipeline or QA stage.
Who the hell let agents directly use a database? Even humans don’t get this privilege. So, of all things, we forgot how to write APIs now? The article suggests creating a role for the agent directly in the database. What is wrong with you people? The very title of the article defeats its own purpose. They are not designed for this so don’t let them be used like this, ffs.
Uhm, everyone I know? To let it write to a production database, that’s on another level though.
Databricks introduced Lakebase exactly for this purpose. Lakebase supports branching that makes querying, schema evolution and writes cheap. Each agent can run its own branch without affecting production data.
think of a flat database table as a projection (of all previous SQL queries), should you give access to it to agents?
probably not, maybe only for analytical (OLAP) purposes in read-only mode.
for transactional OLTP loads, it is better to use Kafka style durable queues, have agents create a change record to mutate the state, but not the projection itself, which could be recomputed at arbitrary point in time via time-travel mechanism, could be branched out into different versions, etc
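A sketch of that change-record idea using a plain append-only table (a Kafka topic would play the same role; all names hypothetical):

```sql
-- Agents append change records; they never touch the projection.
CREATE TABLE user_events (
  event_id   BIGSERIAL PRIMARY KEY,
  user_id    BIGINT NOT NULL,
  event_type TEXT NOT NULL,           -- e.g. 'email_changed'
  payload    JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
GRANT INSERT ON user_events TO agent_role;

-- The projection is recomputed from events rather than mutated in
-- place; replaying up to an earlier event_id gives you time travel.
```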
> The contract goes something like this: the caller is a human-authored application, running deterministic code, issuing predictable queries, reviewed by a developer before deployment. Writes are intentional. Connections are brief. When something goes wrong, a human notices. The database can be dumb and fast because the application layer is smart and careful.
This assumption is that of a non-DBA who happened to get a hold of a database.
When you have sufficient users, your expected set of queries is a complete shit show. Some will be efficient, many will be poorly tested and psychotic, and indistinguishable from a non-deterministic LLM.
Also who said humans can’t query the database directly in prod? If not direct sql access, business users have the next best thing with custom reports and such. And they will very much ask for ridiculous amounts of computation to answer trivial questions.
It was a foundational assumption of SQL that business users would directly access the database and write their own queries.
It’s why row level access and permissions exist. Use them
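Row-level security in Postgres, as a sketch (policy, role, and table names hypothetical):

```sql
-- Anyone connecting as report_role only ever sees rows for the
-- tenant bound to their session.
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON invoices
  FOR SELECT TO report_role
  USING (tenant_id = current_setting('app.tenant_id')::BIGINT);
```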
> The database can be dumb and fast because the application layer is smart and careful.
I’ve always baked important invariants directly into the database with constraints and triggers. Maybe this is because I work on internal apps, where the data is more important than the presentation. Maybe it’s from my functional programming experience and some need to make invalid states unrepresentable.
Regardless, I believe that the data layer should be the most carefully designed part of an app.
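A small sketch of baking an invariant into the schema (names hypothetical):

```sql
-- The invariants live in the database, not in application code.
CREATE TABLE transfers (
  id        BIGSERIAL PRIMARY KEY,
  from_acct BIGINT NOT NULL,
  to_acct   BIGINT NOT NULL,
  amount    NUMERIC(12, 2) NOT NULL CHECK (amount > 0),
  CHECK (from_acct <> to_acct)  -- invalid state is unrepresentable
);
```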
At one place, the saying was that databases can handle everything except a class of sophomores learning how to use databases.
I wonder if that guy gets requests by email from his "talks" section.
Why are you connecting your agent to a database with write access? Are you out of your mind?
Agree across the board.
It's interesting how many people are just writing off the whole premise of letting agents work directly with databases. SQL is a language and not an API because it was intended to be hand written and often still is. There are lots of legitimate use cases for this if the safety level can be raised high enough.
I'm doing some research on this topic at the moment, along with many other people on the database team at Oracle. The article is written from a Postgres perspective. Most of these problems don't surface with a better database engine and you don't need to do any work to solve them. For people also interested in this topic, here's some notes on how things line up:
• Statement timeouts: an indirect hack. What you actually want are resource usage limits like CPU wall clock time excluding lock waits. That's "CREATE PROFILE" in Oracle, or if you want more power (e.g. RAM / IOP limits too) there's https://docs.oracle.com/en/database/oracle/oracle-database/2...
• Writes: This boils down to the user might make a mistake and yes, sure, no database can magically stop all mistakes. The usual fix here is to define stored procedures that provide a set of safe write operations. Same in all DBs.
• Soft deletes: You can turn on Flashback (time travelling SQL) and then just use regular DELETEs. You can also undo transactions in some cases in Oracle even if other transactions happened afterwards as the DB can generate SQL for you that attempts to undo the effects.
• Idempotency keys: Built in if you turn on Transaction Guard. https://docs.oracle.com/en/database/oracle/oracle-database/2...
• Assumption that connections are brief: Oracle has built in server-side DB pooling and horizontal scalability, so separate 'bouncers' aren't needed. You can just let agents connect directly without needing special infrastructure.
• Observability: you can associate metadata with connections that are then associated with recorded queries so agent/step can be looked up given a query. You can then find these queries if they're holding locks.
• Schema is a contract with eng: Well named schemas with comments are a good practice in any DB. Oracle views are automatically writeable in many cases (i.e. the SELECT query is inverted), so if your schemas are messy you can use views to clean them up and those views are still usable to make changes to the data if you need to.
• Scoping blast radius: Lots of security features in the DB to do this, as well as things not typically considered security like transaction priority levels to support lock breaking (run agents at low priority and app server writes will kill agent transactions). And you can easily make DB CoW snapshots.
So once your DB handles all the basics like that, the interesting things remaining are really all the semantics that are encoded into the application layer that the database doesn't see.
Pointing agents (or humans) at a live database to write arbitrary SQL might sound scary, but it's the exact use case databases have been designed for from the beginning, and at least some of them have lots of features designed to make this as safe as possible. It may be that even more features are needed - that's the sort of question I'm currently exploring. But the foundation is there, at least.
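As a sketch of the first bullet's CREATE PROFILE approach (user name and limit value illustrative; CPU_PER_CALL is expressed in hundredths of a second):

```sql
-- Oracle: cap CPU per call for the agent's account.
CREATE PROFILE agent_profile LIMIT CPU_PER_CALL 3000;
ALTER USER agent_user PROFILE agent_profile;
```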
Jesus Christ. All this is true only if you let them.
It's a good idea to be defensive and design the system in a way that it can "fix" itself.
But for love of god, don’t let an LLM do everything it wants.