← Back to context

Comment by Pxtl

10 hours ago

As much as relational DBs have held back enterprise software for a very long time by being so conservative in their development, the fact that they force you to put this relationship absolutely front-of-mind is excellent.

I'd personally consider "persistence" AKA "how to store shit" to be a very different concern compared to the data structures that you use in the program. Ideally, your design shouldn't care about how things are stores, unless there is a particular concern for how fast things read/writes.

  • Often significant improvements to every aspect of a system that interacts with a database can be made by proper design of the primary keys, instead of the generic id way too many people jump to.

    The key difficulty is identifying what these are is far from obvious upfront, and so often an index appears adjacent to a table that represents what the table should have been in the first place.

    • I guess that might be true also, to some extent. I guess most of the times I've seen something "messy" in software design, it's almost always about domain code being made overly complicated compared to what it has to do, and almost never about "how does this domain data gets written/read to/from a database", although it's very common. Although of course storage/persistence isn't non-essential, just less common problem than the typical design/architecture spaghetti I encounter.

    • I'm a firm believer in always using an auto-generated surrogate key for the PK because domain PKs always eventually become a pain point. The problem is that doing so does real damage to the ergonomics of the DB.

      This is why I fundamentally find SQL too conservative and outdated. There are obvious patterns for cross-cutting concerns that would mitigate things like this but enterprise SQL products like Oracle and MS are awful at providing ways to do these reusable cross-cutting concerns consistently.

  • I meant to reply to a different comment originally, specifically the one including this quote from Torvalds:

    > Good programmers worry about data structures and their relationships.

    > -- Linus Torvalds

    I was specifically thinking about the "relationship" issues. The worst messes to fix are the ones where the programmer didn't consider how to relate the objects together - which relationships need to be direct PK bindings, which can be indirect, which things have to be cached vs calculated live, which things are the cache (vs the master copy), what the cardinality of each relationship is, which relationships are semantically ownerships vs peers, which data is part of the system itself vs configuration data vs live, how you handle changes to the data, (event sourcing vs changelogging vs vs append-only vs yolo update), etc.

    Not quite "data structures" I admit but absolutely thinking hard about the relationship between all the data you have.

    SQL doesn't frame all of these questions out for you but it's good getting you to start thinking about them in a way you might not otherwise.