Comment by vivzkestrel

3 months ago

Why can't this be done with PostgreSQL?

The short answer is that we tried, back in 2020, while working on a central bank payment switch funded by the Gates Foundation. We found we were hitting the limits of Amdahl's Law: Postgres' concurrency control holds row locks across network round trips as well as internal I/O. That led to the design of TigerBeetle, specialized not for general-purpose workloads but only for transaction processing.

On the one hand, yes, you can use a general-purpose string database to count and move integers, up to a certain scale. But a specialized integer database like TigerBeetle can take you further. It's the same reason that, yes, you could use Postgres as object storage or as a queue, or you could use S3 and Kafka and get separation of concerns in your architecture.

I did a talk diving into all this recently, looking at the power law, OLTP contention, and how this interacts with Amdahl's Law, Postgres, and TigerBeetle: https://www.youtube.com/watch?v=yKgfk8lTQuE

  • I am not an expert on the limitation you claim to have encountered in PostgreSQL, but perhaps someone with more PostgreSQL expertise can chime in on this comment and give some insight

    • For updating a single resource where the order of updates matters, the best throughput one can hope for is the inverse of the lock hold time: if each update holds the row lock for one millisecond, that row can see at most about a thousand ordered updates per second. Typical Postgres-using applications follow a pattern where a transaction involves multiple round trips between the application and the database, because decisions are made in code running on the application server.
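
      For illustration, a sketch of that round-trip pattern against a hypothetical accounts table; the row lock taken by the first statement is held across every subsequent round trip until the commit:

          -- Round trip 1: open the transaction and take the row lock.
          BEGIN;
          SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;

          -- ... application code inspects the balance and decides what to do ...

          -- Round trip 2: apply the decision; the row lock is still held.
          UPDATE accounts SET balance = balance - 100 WHERE id = 1;

          -- Round trip 3: commit; only now is the lock released.
          COMMIT;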

      But this pattern is not required by PostgreSQL: it's possible to run arbitrarily complex transactions entirely on the server side using more complex query patterns and/or stored procedures. In that case the lock hold time is determined mainly by time-to-durability, which, depending on infrastructure specifics, might be one or two orders of magnitude shorter. Or, in the case of fast networks and slow disks, it might not have a huge effect.
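
      A minimal sketch of the server-side variant, assuming the same hypothetical accounts table; the whole transfer runs in a single round trip, so the row locks are held for roughly one commit's worth of durability latency:

          -- Hypothetical schema: accounts(id bigint primary key, balance bigint).
          CREATE FUNCTION transfer(from_id bigint, to_id bigint, amount bigint)
          RETURNS void
          LANGUAGE plpgsql AS $$
          BEGIN
              -- Update in a fixed id order so concurrent transfers cannot deadlock.
              IF from_id < to_id THEN
                  UPDATE accounts SET balance = balance - amount WHERE id = from_id;
                  UPDATE accounts SET balance = balance + amount WHERE id = to_id;
              ELSE
                  UPDATE accounts SET balance = balance + amount WHERE id = to_id;
                  UPDATE accounts SET balance = balance - amount WHERE id = from_id;
              END IF;
          END;
          $$;

          -- One round trip per transfer; the locks are released at commit.
          SELECT transfer(1, 2, 100);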

      One can also use batching in PostgreSQL to update the resource multiple times per durability cycle. This requires some extra care from the application writer to avoid getting bogged down by deadlocks and serializability conflicts.
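
      A sketch of that batching idea, again with hypothetical accounts and transfers tables; the hot account receives one aggregated update, and the whole batch shares a single commit, i.e. a single WAL flush:

          BEGIN;

          -- Record the individual transfers of the batch (audit trail).
          INSERT INTO transfers (from_id, to_id, amount)
          VALUES (1, 2, 100), (3, 2, 250), (4, 2, 75);

          -- Apply the batch's net effect: the hot account (id = 2) is locked and
          -- updated once for the whole batch instead of once per transfer.
          UPDATE accounts SET balance = balance + 425 WHERE id = 2;
          UPDATE accounts SET balance = balance - 100 WHERE id = 1;
          UPDATE accounts SET balance = balance - 250 WHERE id = 3;
          UPDATE accounts SET balance = balance -  75 WHERE id = 4;

          COMMIT;  -- one durability wait covers all of the updates above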

      What will absolutely kill you on PostgreSQL is high contention under repeatable read and higher isolation levels. At those levels PostgreSQL handles update conflicts with optimistic concurrency control, and high contention invalidates all of that optimism. So you need to be clever enough to achieve the necessary correctness guarantees with read committed and the funky semantics it has for update visibility, or use some external locking to get rid of contention inside the database. An option for pessimistic conflict handling at the higher isolation levels would be very helpful for these workloads.
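
      One common way to stay correct at read committed (a sketch, same hypothetical table): fold the invariant into a single UPDATE and rely on the fact that a blocked UPDATE re-checks its WHERE clause against the newest committed row version before applying:

          -- Atomic conditional debit: blocks on a concurrent writer, then re-checks
          -- balance >= 100 against the latest row version before updating.
          UPDATE accounts
             SET balance = balance - 100
           WHERE id = 1
             AND balance >= 100
          RETURNING balance;
          -- Zero rows returned means the debit was rejected, not lost.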

      What would also help is a different kind of optimism, one that removes the durability requirement from the lock hold time, with the trade-off that readers then have to wait for durability. Postgres can do tens of thousands of contended updates per second with this model. See the Eventual Durability paper for details.