← Back to context

Comment by AtlasBarfed

2 years ago

The Sequencer:

- does not have a persistent/disk-backed state

- It is a singleton process

- it and only it does order, no logs do ordering

... if the singleton sequencer crashes, I do not see on this high level description how the system recovers, if the sequencer is the only one that knows write order but has no persistent write "log".

What am I missing?

This... does not appear to be something you run outside of a dedicated datacenter, AWS with its awful networking and slow/silently throttling storage would probably muck this thing up under any substantive scale?

What you are missing is that the "tlogs" (transaction logs) actually hold the durable, fault tolerant write log. The sequencer is just a big fast in-memory data structure that checks if the many transactions coming into the system pass isolation checks (the I in ACID). That is, it accepts transaction so long as the keys that the transaction read haven't been modified in the mean time.

The reason it can fail without a correctness issue is that it can just reject all transactions in flight for the clients to retry. This is something the clients need to be prepared to do anyway because of optimistic concurrency.

It can run fine on AWS. Upon a failure, the sequencer role is very fast to re-elect onto another machine in the cluster because there is no persistent state at all.

It runs fine in AWS, Snowflake and many others run it there. The most recent FoundationDB paper goes into a lot more detail on their recovery protocol, it’s a lot more nuanced than you think, but it works extremely well