Comment by eikenberry
7 hours ago
> The reason DBs like Mongo or Dynamo exist is because Postgres has a scaling problem.
I've used Postgres at a few places and the #1 problem was always high availability, not scaling. One Postgres cluster could easily handle 100000 transactions per minute, but when a primary node went down it meant a page, a manual failover to the spare, and then manually replacing the spare. The manual tooling was very finicky, but at least it worked; no automated solution came even close. Lack of a good HA story is why I avoid self-managed Postgres as much as possible.
Good thing we support HA as well: https://docs.pgdog.dev/features/load-balancer/
Load balancer with health checks and failover, works out of the box. :) Battle-tested at this point too, so could be worth a look.
I've extensively used Dynamo (internally at Amazon and externally) and even founded a DB startup with it at its core. Boiling down the scalability of Postgres vs Dynamo the way the blog does is a bit terse. Dynamo scales writes horizontally with the keyspace, forever. Postgres simply can't, and no number of layers between the machines and the developer changes that. Sharding, pooling, and Citus are all layered on top of an engine where a given row's writes still land on one primary.
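To make "scales writes horizontally with the keyspace" concrete, here is a minimal sketch of hash partitioning. It is an illustration only: `partition_for` and `write_load` are hypothetical helpers, the SHA-256-plus-modulo scheme is an assumption and not Dynamo's actual internal hash, and real systems avoid plain modulo because it remaps most keys when the partition count changes.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition by hashing it (illustrative scheme only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def write_load(keys, num_partitions: int) -> Counter:
    """Count how many writes each partition would receive for these keys."""
    return Counter(partition_for(k, num_partitions) for k in keys)


# Hypothetical workload: one write per key.
keys = [f"user#{i}" for i in range(10_000)]
for n in (1, 4, 16):
    load = write_load(keys, n)
    print(n, "partition(s): busiest one takes", max(load.values()), "writes")
```

With one partition every write lands on the same node, which is the single-primary case; as partitions are added, the busiest node's share shrinks as long as the keys are spread evenly.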
Curious how the DB startup with Dynamo at its core went. We use it heavily. The primary tricky thing for us at the moment is aligning pricing with workload value.
Except that Dynamo is still just glorified MySQL? https://news.ycombinator.com/item?id=18871661
I don’t think the backend matters. It’s the frontend wrapper that makes or breaks HA.
Dynamo is a fundamentally different DB to Postgres. If your problem fits into the Dynamo approach (I'd argue that more problems do), then you should be using it. Not all problems fit, though.
That's great news! I'll bookmark this in case I'm forced to manage Postgres again.
What do you use instead?
Is a load balancer HA?
Not by itself if it's naive, but if it can assess target health and avoid degraded instances, it becomes one component of HA; the other is integration with an orchestrator for graceful recovery.
Combined with a replication strategy and automated health checks, a load balancer could direct traffic to a healthy instance automatically.
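A rough sketch of that idea, under stated assumptions: `Target`, `run_health_checks`, `pick_read_target`, and `pick_write_target` are hypothetical names, the probe is a stand-in for a real connectivity check, and this is not how PgDog or any particular load balancer is implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Target:
    name: str
    is_primary: bool
    healthy: bool = True


def run_health_checks(targets: List[Target], probe: Callable[[Target], bool]) -> None:
    """Refresh each target's health flag using the supplied probe."""
    for t in targets:
        t.healthy = probe(t)


def pick_read_target(targets: List[Target]) -> Optional[Target]:
    """Prefer a healthy replica for reads; fall back to any healthy node."""
    healthy = [t for t in targets if t.healthy]
    replicas = [t for t in healthy if not t.is_primary]
    pool = replicas or healthy
    return pool[0] if pool else None


def pick_write_target(targets: List[Target]) -> Optional[Target]:
    """Writes need a healthy primary; return None if there isn't one."""
    for t in targets:
        if t.is_primary and t.healthy:
            return t
    return None


# Simulate the primary going down (the probe is faked with a set of dead nodes).
targets = [Target("pg-1", is_primary=True), Target("pg-2", is_primary=False)]
down = {"pg-1"}
run_health_checks(targets, probe=lambda t: t.name not in down)
print(pick_read_target(targets))
print(pick_write_target(targets))
```

Reads keep flowing to the surviving replica, but the write path has nowhere to go until something promotes `pg-2`. That promotion step is the orchestrator's job, which is why health-aware routing alone is only part of HA.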
What happens when the load balancer fails?
Patroni 1.0 was released in 2016, i.e. ~10 years ago.
https://github.com/patroni/patroni
Yup, Patroni handles automatic failover and cluster management quite well.
Patroni serves this niche pretty well at this point.
Have you looked into things like CloudNativePG? https://cloudnative-pg.io/
CNPG is quite nice and robust, but I'd still be a bit reluctant to stack PG on k8s for really big clusters. The k8s ecosystem moves quite quickly, and there's lots of patching/maintenance/churn, which means more PG failovers. So it depends on how well your workload handles those (they normally only last a few seconds).
~1600 TPS is not 'high scale'.
Pretty good for 98% of projects though.
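For reference, the ~1600 TPS figure is just the 100000 transactions per minute quoted at the top of the thread, converted to per-second. A quick check (variable names are illustrative):

```python
transactions_per_minute = 100_000  # figure from the original comment
tps = transactions_per_minute / 60  # 60 seconds per minute
print(round(tps))  # prints 1667
```

So "~1600" is a slight round-down of roughly 1667 transactions per second.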