Comment by freetime2
2 days ago
Anecdotally, the worst codebase I ever worked on made heavy use of stored procedures. Over the years people couldn’t be bothered or were afraid to update them - which really was the root of the problem. This led to all kinds of crazy patterns in the application code like calling things in a loop where a bulk operation was needed. Or stringing together several stored procedure calls to get the desired outcome, when really a refactor was need to combine or decompose them as needs evolved.
If you’re hiring devs for their experience with C# (which the app was written in) then you probably want to have the bulk of your business logic written in C#. Even the DBAs at my current company tend to advocate for using stored procedures only as a last resort for this reason.
That team eventually decided they needed to rewrite the whole app using a NoSQL database. I didn’t stick around long enough to see how that decision turned out.
I am not going to dismiss your experience here. Stored Procedures can turn into wild monsters. Pair it with Triggers and you are in for chasing a noodle.
But it's also a reality, that relational databases often become the main integration point in companies. In those environments it’s hard (next to impossible) and dangerous to use something like ORMs.
Often enough I don't "own the tables" and I don't "own the columns" for lack of a better word. The DBA only gives me SELECT and EXECUTE permissions on the database. How am I going to work here without Stored Procedures?
And while this sounds weird, these limited permissions are to protect me from accidentally writing to wrong columns. Wrong columns that could impact systems, I didn't even know about. Is it possible to write to the same columns with a Stored Procedure? Of course! But it's not as dangerous as giving an application fat UPDATE permissions.
By using SQL Views I can build a data model upon these tables for reading the data, and build a more consumable data model. And you mentioned C#: I can use EF Core to query these Views with LINQ. Stored Procedures are used to store data to multiple tables in a "legacy database" within a transaction.
This could also be done with EF Core, but I need to explicitly log all changes to the data. How should this be done without a Stored Procedure or Triggers? CDC doesn't help a lot here.
That pattern, databases being the main integration point in companies, is returning with data lakes like snowflake, databricks, and ducklake (poor man’s snowflake). Where better to get your integration data than in a unified, quality controlled, central location. No need to call multiple services, no need to unify different data models.
How many DBAs have you dealt with that only give you SELECT and EXECUTE? That seems somewhat crazy. It doesn't cut down the surface area for bugs at all (the same amount of SQL is still required and it'd be the same code as the app was going to execute anyway as a normal query). What scenarios are they worried about here?
We only give select and execute permissions. This has saved our asses a lot of times. Do not underestimate the damage a development team with a high turnover can do.
I have seen many actually.
Imagine having a huge amount of applications all using the same database, which is, like I said, often enough the main integration point in a company. The applications you are maintaining are written in, let’s say Java, C#, Python and C++. They are too important, too large and a rewrite is way too risky.
How would you start to get this under your control? By moving common logic in a shared C++ DLL probably? Rewrite all those applications at once, in a heroic effort? I for one take the pragmatic approach and extract common business logic shared between all them into a Stored Procedure.
The Stored Procedures are invoked by all applications, so instead of maintaining 10 implementations, I maintain 1. Suddenly you have a common data access path! The 1 implementation will be drowned in a shitload integration tests, because this is what really matters.
Of course, a Stored Procedure doesn’t prevent you from shooting yourself in the foot! But in my experience, it’s way harder to shoot yourself in the foot using a Stored Procedure, that you can inspect the Query Planner for… than hunting down Monster-Queries for weeks, generated by a LINQ provider.
As for the INSERT, UPDATE and DELETE: With SPs, you’ll have fine-grained control, which columns are updated instead of needing to expose all properties in your application code and praying for an ORM to not accidentally update “columns I don’t own”, because of some convention applied by the ORM.
If you are in the lucky position of owning the database and don’t have to maintain legacy applications, I see resistance to Stored Procedures. Then go ahead and build an API all applications are going to use.
But if you don’t own the database and need to maintain dozens of mission-critical applications, Stored Procedures are a tool worth knowing.
1 reply →
Amending data obviously. Building a dashboard using payroll data and having the power to change it are two wildly different things, and no sane large corporation would allow a dashboard's user account to change that data .
You're coming at this from the idea that one user has full access to everything when that is likely the biggest security lapse you'll find in modern apps that use databases: a Web app that has full ddl and even dml access when it shouldn't have.
2 replies →
> you probably want to have the bulk of your business logic written in C#
Perhaps, if the C# applications are the only ones accessing the database. But suppose multiple applications written in different languages (and for different purposes) need the database access? You can do this via stored procs or (better in my experience) by adding an intermediate server process which the applications use, via a publicly documented API, and which then talks to the the database which may provide stored proc access to the intermediate server.
Equally anecdotally:
An example is a server me and a couple of colleagues worked on at a big investment bank. The underlying databases were mostly proprietary from suppliers or written by other teams in the bank. And versioned (badly, eek!). We wrote the SQL queries/procs to access the data that the VB (now C# I guess) would not have been capable of (without enormous waste of effort), and then provided COM interfaces to our server that ran our queries. Mercifully, we didn't have to provide updates, but the same idea would have applied. This gives you more or less complete isolation between app and database.
Anyway, SPs can certainly fit into such an architecture, but might not be first (or second) choice.
> You can do this via stored procs or (better in my experience) by adding an intermediate server process which the applications use, via a publicly documented API, and which then talks to the the database which may provide stored proc access to the intermediate server
I don't see how that is materially different from what the OP said, you're both advocating for a single server in front of the database. We can call it intermediate or not, but the only difference is whether the API is public, so it makes sense to keep business logic in the app code.
> Perhaps, if the C# applications are the only ones accessing the database. But suppose multiple applications written in different languages (and for different purposes) need the database access?
Without wishing this to sound like a personal attack, YUCK
A database that's accessed by multiple applications, regardless of the number of languages, is a really bad smell.
If I have a user db, and hundreds of other applications want to make use of the data in it, they come through MY application. I own the db, I decide what's in it, how the data is shaped, and when it changes.
Multiple applications accessing a single database is asking for a bureaucratic nightmare - "App A wants to change the first name field to UTF-8" - EVERY other user, of that database needs to be informed (best of luck figuring out what other apps are, and who owns them). If you are tempted to say "Those other apps should support UTF-8 as well" then that's the sound of someone that's never had legacy code in the environment.
It's not much different. If you have a bunch of API clients that don't agree on character encodings you'll have a tricky migration process too.
The important part is what changes it.
Read my files, fine. I always try to avoid breaking changes anyway. Just don't write them!!
> A database that's accessed by multiple applications, regardless of the number of languages, is a really bad smell.
Except that really was the original model back in the 90's. All "good" databases had an extensive roles and permissions system, and a point of pride was the number of ODBC connectors for different languages and environments.
You were supposed to have The Database, looked after and controlled by the hallowed DBAs, who had their own hardware budget, their own organization, and who controlled access to The Database, giving trusted applications access to it, but only after they had vetted the schema and queries that those dirty developers wanted to run against it. Trusted users could get SELECT access, but only to the tables they needed to run their custom reports and queries.
It was a whole ass thing that's fallen completely to the wayside as database software went from underpinning Larry's latest mega-yacht, to free as in beer, and we learned how to clone and shard the data instead.
Precisely why I said the applications should access the data via a server process - they never know the underlying database schema.
I'm convinced that the solution to stored procedures that people are afraid to update is automated tests. Write unit tests for them just like you would any other complex logic - the tests can evolve over time into a detailed spec for how they should work, and give you the ability to refactor or improve them with confidence later on.
For some reason stored procedures and automated testing seem to often not go together, maybe because they're often accompanied by an environment where DBAs and software developers are separate teams with separate sets of skills?
I recently took a couple 100 line stored procedures that I wrote years ago, and replaced them with equivalent go code and a few inline SQL calls. We had extensive tests on those stored procedures too, or rather, tests on the go methods that called them. The tests all ran in a few seconds. The go code feels so much nicer to deal with, and I thought about why:
- Stored procedures are deployed separately from the code that calls them: one is a db migration, one is a k8s container deploy. And that binding isn’t strongly-typed, so there’s more thinking and double checking involved during deployment (and, heaven forbid, rollback).
- The SQL procedures duplicated some business logic and constants that are defined in some of our go core libraries. It felt good to remove that duplication.
- The procedures also had a few specific return codes. We needed to define some constants (like postgres itself does) that the go code could detect and translate into go errors. That’s more complexity.
- Maybe there’s a good SQL IDE out there, but I don’t have it, so I was writing these things in vscode with syntax highlighting. Plenty of errors that would get an immediate underline in go instead required 20 seconds to compile and run the test and interpret the error message.
- This lack of tooling, and the fact that I’m not great at SQL (I’m fairly good with queries, but not with stuff like variables and IF EXISTS that you only see in stored procedures), made it hard to use all the complexity-cutting techniques I’m used to applying in go and other languages. So the 100 line stored procedure stayed 100 lines, and everyone hated working on it.
Stored procedures are basically a cross-service API call, with all the complexities that entails. I won’t reach for them again unless there’s a technical problem that can’t be solved any other way. We don’t have a dedicated DBA so Conway’s Law doesn’t factor in.
I think a lot of the problems come from the fact testing stored procedures ventures into e2e testing land. You have to stand up infra in order to test them. There's not really been a simple way to unit test stored procedures as part of your application codes testing framework.
(Aside: I think this is something PGlite helps with if your in Postgres land)
My team has found a lot of success using testcontainers for testing Go and Java applications integrating with Postgres. They feel more unit-testy than e2e-testy.
Admittedly I’m only talking about selects, inserts, updates, views, etc. not stored procedures. But having worked in codebases with far too many stored procedures in the past, I think there might be a marginal improvement.
For what it’s worth, I fully agree that the main problem with using store procedures is testability and debugability.
PostgreSQL server is a single process that starts in under 100ms on a developer's laptop.
In the company I work for we use real PostgreSQL in unit tests — it's cheap to start one at the beginning of a suite, load the schema and go, and then shut it down and discard its file store.
I keep thinking of moving that file store to tmpfs when run on Linux, but it's nowhere near the top of the performance improvements for the test suite.
So: no more mocks or subsitute databases with their tiny inconsistencies.
In good resource on testing stored procedures ?
There are unit-testing frameworks for the purpose. For SQL Server there are several, [0] I've worked briefly with one called tSQLt [1] and was quite impressed. Frameworks also exist for other RDBMSs. [0]
[0] https://en.wikipedia.org/wiki/List_of_unit_testing_framework...
[1] https://tsqlt.org/user-guide/quick-start/
I'd test them the same way I test other database stuff: run automated tests which spin up a database server for the duration, then run migrations to install the schema and stored procedures, then execute tests using your programming language of choice.
I still mostly use Django for PostgreSQL projects and the Django default test framework is great at running a temporary PostgreSQL (or MySQL or SQLite).
For Postgres, there’s pgTAP: https://pgtap.org/
Oh man. I've seen an org where the stored procedures were written by a different team, and friction was high.
Then one java dev found a glaring sql injection hole. The whole app quickly reorganized around using it as an API for everything. Management was pleased speed went up and bugs went down, DBAs were pleased for less work, and Java devs were pleased they could understand what was going on in the app. Everybody happy I guess?
I’d rewrite the complete team instead.