Comment by jiggawatts

2 days ago

Blogs like this make me go on the same rant for the n-th time:

Read-after-write consistency for distributed systems is practically impossible without APIs returning cookies containing vector clocks.

The idea is simple: every database has a logical sequence number (LSN), which the replicas try to catch up to -- but may be a little bit behind. Every time an API talks to a set of databases (or their replicas) to produce a JSON response (or whatever), it ought to return the LSN of each database that served the query in a cookie. Something like "db1:591284;db2:10697438".

Client software must then union this with its existing cookie (taking the per-database maximum) and send the result with the next API call.
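A toy sketch of that client-side union, assuming the semicolon-separated "db:lsn" cookie format above (the function name and format details are mine, not a real library's):

```python
def merge_lsn_cookies(old_cookie: str, new_cookie: str) -> str:
    """Union two LSN cookies, keeping the highest LSN seen per database."""
    def parse(cookie: str) -> dict[str, int]:
        # "db1:591284;db2:10697438" -> {"db1": 591284, "db2": 10697438}
        return {db: int(lsn)
                for db, lsn in (pair.split(":") for pair in cookie.split(";") if pair)}

    merged = parse(old_cookie)
    for db, lsn in parse(new_cookie).items():
        merged[db] = max(merged.get(db, 0), lsn)
    # Re-serialize in a stable order so the cookie is deterministic.
    return ";".join(f"{db}:{lsn}" for db, lsn in sorted(merged.items()))
```

Taking the per-database max is what makes this safe: the merged cookie always demands at least as fresh a view as either input did.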

That way, if the client has just inserted some value into db1 and the read-after-write query ends up going to a read replica that's slightly behind the write master (LSN 591280 instead of 591284), then the replica can either wait until it sees LSN >= 591284, or it can proxy the query back to the write master. A simple "expected latency of waiting vs proxying" heuristic can be used for this decision.
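The wait-vs-proxy decision might look something like this (a hedged sketch; the replication-rate estimate and latency numbers are hypothetical inputs, not measured values):

```python
def should_wait(replica_lsn: int, required_lsn: int,
                replication_rate_lsn_per_s: float,
                proxy_latency_s: float) -> bool:
    """Return True if the replica should wait to catch up, False if it
    should proxy the query to the write master instead."""
    lag = required_lsn - replica_lsn
    if lag <= 0:
        return True  # Already caught up; serve the read locally.
    # Estimate how long catching up will take at the observed replication rate.
    expected_wait_s = lag / replication_rate_lsn_per_s
    return expected_wait_s <= proxy_latency_s
```

So a replica 4 LSNs behind with fast replication waits out the tiny gap, while one that is millions of LSNs behind proxies to the master rather than stalling the request.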

That's (almost entirely) all you need for read-after-write transactional consistency at every layer, even through Redis caches and stateless API layers!

(OP here). I don’t love leaking this kind of thing through the API. I think that, for most client/server-shaped systems at least, we can offer guarantees like linearizability to all clients with few hard real-world trade-offs. That does require a very careful approach to designing the database, and especially to read scale-out (as you say), but it’s real and doable.

By pushing things like read-scale-out into the core database, and away from replicas and caches, we get to have stronger client and application guarantees with less architectural complexity. A great combination.

FWIW, I think that’s essentially how Aurora DSQL works, as sort of explained at the end of the article.