Comment by zamalek
2 years ago
Ever since coming across this idea I have wondered what a (likely relational) database which deeply embedded these concepts would look like.
2 years ago
Ever since coming across this idea I have wondered what a (likely relational) database which deeply embedded these concepts would look like.
You can actually get quite some way by adding constraints to columns. NOT NULL, default values and a CHECK constraints on a column can limit random invariance bleeding into the db.
So for example, if I ever see a CHAR(1) column that looks like it's meant to hold a boolean of some sort, the best thing is to immediately lock it down to a strict subset (preferably not NULL) of 2 states like ['Y'|'N'].
I still have nightmares about a db schema that grew a fungus of boolean representations, and massive amounts of code checking for y,n,Y,N,0,1,T,F,t,f all of which were used by different teams at different times. Constraints would have stopped all that.
Here's the docs for Postgres: https://www.postgresql.org/docs/current/ddl-constraints.html
As the sibling post says, constraints can cover a lot of this already. What I'm not sure about is how applications querying the language would interact with a database that provides a lot of useful invariants natively; if the application language's type system can't express those invariants, I think you'd end up with impedance mismatches that could cause a lot of frustration.
I've had some idle thoughts about creating a Postgres extension that would make some of these concepts easy to implement and work with. I modeled a sum type/discriminated union a while back, I was able to make it work with constraints, but it was fairly verbose.
Which part was verbose?
I feel such an extension will immediately find its audience.
Writing out the constraints. The starting point was an admin_notes table to start with, where each note initially had the same fields; we later added different categories of notes, where two of the categories had fields that were only meaningful for that category. What I did was add nullable columns for all of the category-specific fields to the table, with constraints to make the category-specific columns only have values for rows of the appropriate category. The type looked roughly like this (using Rust syntax) [1]:
Setting up the constraints took about 30 lines of SQL just for those couple of boolean fields on some of the categories, you can see the details here [2].
As far as making this a Postgres extension - I'm not sure how useful it would be when the application language doesn't have a notion of sum types. Thinking about it, what might be more useful would be a language-specific library for data validation/constraints that sets up the database constraints as well. I'm not sure, though.
[1] We weren't using Rust, this was in Go, but I figured this was the most succinct way to summarize it. Our GraphQL schema for this type was basically the equivalent of this.
[2] https://github.com/CMSgov/easi-app/blob/main/migrations/V164...
1 reply →