Comment by endisneigh
3 years ago
I do not disagree at all that what you are describing can happen. What I'm not understanding is why they're failing at multi year attempts to fix this.
Even in your scenario you could identify schemas and tables that can be separated and moved into a different database or at maturity into a more scalable NoSQL variety.
Generally once you get to the point that is being described that means you have a very strong sense on the of queries you are making. Once you have that it's not strictly necessary to even use a RDBMS, or at the very least, a single database server.
> Even in your scenario you could identify schemas and tables that can be separated and moved into a different database or at maturity into a more scalable NoSQL variety.
How? There's nothing tracking or reporting that (unless database management instrumentation has improved a lot recently), SQL queries aren't versioned or typechecked. Usually what happens is you move a table out and it seems fine, and then at the end of the month it turns out the billing job script was joining on that table and now your invoices aren't getting sent out.
> Generally once you get to the point that is being described that means you have a very strong sense on the of queries you are making.
No, just the opposite; you have zillions of queries being run from all over the case and no idea what they all are, because you've taught everyone that everything's in this one big database and they can just query for whatever it is they need.
Every database I know of can generate query logs. Why not just log every query and do some statistical analysis on it?
> Every database I know of can generate query logs.
Every one I know of warns that it comes with significant performance implications and isn't intended to be used in production.
> Why not just log every query and do some statistical analysis on it?
And then what? If you know this table only gets queried a few times a month, what does that actually tell you that you can use?
4 replies →