Comment by antics

1 month ago

> # Exactly-once execution

> DBOS has a special @DBOS.Transaction decorator. This runs the entire step inside a Postgres transaction. This guarantees exactly-once execution for transactional database steps.

Totally awesome, great work, just a small note... IME a lot of (most?) pg deployments have synchronous replication turned off because it is very tricky to get it to perform well[1]. If you have it turned off, pg could journal the step, formally acknowledge it, and then (as I understand DBOS) totally lose that journal when the primary fails, causing you to re-run the step.
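To make the failure mode concrete, here's a minimal sketch of the journal-in-the-same-transaction pattern as I understand it, using sqlite3 as a stand-in so it runs anywhere (the table and function names are mine, not DBOS's):

```python
import sqlite3

def run_step_once(conn, workflow_id, step_id, apply_step):
    """Journal a step's completion in the same transaction as its writes.

    apply_step(cur) performs the step's actual database work. The schema
    here (step_journal) is invented for illustration.
    """
    cur = conn.cursor()
    row = cur.execute(
        "SELECT 1 FROM step_journal WHERE workflow_id = ? AND step_id = ?",
        (workflow_id, step_id),
    ).fetchone()
    if row:
        return False  # already journaled; a re-run becomes a no-op
    apply_step(cur)
    cur.execute(
        "INSERT INTO step_journal (workflow_id, step_id) VALUES (?, ?)",
        (workflow_id, step_id),
    )
    conn.commit()  # step writes and journal entry become durable together
    return True
```

The point is that the step's write and the journal row commit together. If the primary acks that commit and then dies before an async standby receives it, the promoted standby has neither row, so the journal check passes and the step runs again.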

Last time I was on call for pg, failover with some data loss happened to me twice. So it does happen. I think this is worth noting because if exactly-once is a hard requirement, then (unless I'm mistaken) you either need to set up sync replication or you need to plan for this guarantee to possibly fail.

Lastly, note that the pg docs[1] have this to say about sync replication:

> Synchronous replication usually requires carefully planned and placed standby servers to ensure applications perform acceptably. Waiting doesn't utilize system resources, but transaction locks continue to be held until the transfer is confirmed. As a result, incautious use of synchronous replication will reduce performance for database applications because of increased response times and higher contention.
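For completeness, turning it on is conceptually simple even if tuning it isn't. Roughly, on the primary (the standby names below are placeholders that must match the application_name each standby sets in its primary_conninfo):

```ini
# postgresql.conf on the primary
synchronous_standby_names = 'FIRST 1 (standby1, standby2)'
synchronous_commit = on   # commits wait until a listed standby flushes the WAL
```

The perf caveat in the quote above follows directly: every commit now includes a network round trip to a standby, with locks held the whole time.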

I see the DBOS author around here somewhere so if the state of the art for DBOS has changed please do let me know and I'll correct the comment.

[1] https://www.postgresql.org/docs/current/warm-standby.html#SY...

Yeah, that's totally fair--DBOS is built entirely on Postgres, so it can't provide stronger durability guarantees than your Postgres deployment does. If Postgres loses data, then DBOS can lose data too. There's no way around that if you're using Postgres for data storage, no matter how you architect the system.

  • That’s my intuition as well, but it does raise a question in my mind.

    We have storage solutions that are far more robust than the individual hard drives that they’re built upon.

    One example that comes to mind is Microsoft Exchange databases. Traditionally these were run on servers that had redundant storage (RAID), and at some point Microsoft said you could run it without RAID, and let their Database Availability Groups handle the redundancy.

    With Postgres, that would look like: during an HTTP request, you write the change to multiple Postgres instances before acknowledging the update to the requesting client.

    • Yes exactly! That's how Postgres synchronous replication works. If you turn on synchronous replication, then you won't lose acknowledged commits unless the primary and all its synchronous standbys are wiped out. The question the original poster was asking was what guarantees can be provided if you DON'T use synchronous replication--and the answer is that there's no such thing as a free lunch.