Comment by sambigeara

1 day ago

So, the moment a partition occurs, the nodes in each resulting partition treat their remaining reachable peers as the full view of the world. There is _no_ concept of a split brain scenario.

ANY decision around network topology or workload placement is a deterministic calculation that every node runs individually. If all nodes see the same subset of peers as their entire "cluster", they'll all naturally converge on the same view of what the cluster should look like. If the calculated output determines that Node A should claim Seed B, and Node A doesn't have it, it requests it from a peer that does.
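To make the "deterministic calculation run by all nodes individually" idea concrete, here is a minimal sketch using rendezvous (highest-random-weight) hashing. This is an assumption chosen for illustration, not necessarily the calculation the project actually uses, and the names `score` and `owners` are hypothetical.

```python
import hashlib
from typing import Iterable, List


def score(node: str, seed: str) -> int:
    """Deterministic weight for a (node, seed) pair; identical on every node."""
    digest = hashlib.sha256(f"{node}:{seed}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def owners(seed: str, visible_peers: Iterable[str], replicas: int = 2) -> List[str]:
    """Pick the peers that should hold `seed`, given only the locally visible peer set.

    Hypothetical helper: any two nodes that see the same peer set compute
    the same answer without coordinating with each other.
    """
    # Sort by score, breaking ties on the node name so the order is total.
    ranked = sorted(visible_peers, key=lambda n: (score(n, seed), n), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    full_cluster = {"node-a", "node-b", "node-c", "node-d"}
    partition = {"node-a", "node-b"}  # what node-a and node-b see after a split
    print(owners("seed-b", full_cluster))
    print(owners("seed-b", partition))
```

Each side of a partition just calls `owners` with whatever peers it can currently see. When the partition heals, the peer set grows again and the same function yields the placement for the merged cluster.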

As soon as the partition recovers, nodes see the additional nodes re-enter the candidate set, and those nodes are then included in future routing and placement decisions.

The main tradeoff to understand here is that you're at the mercy of the random (best-effort redundant) placement of a seed. If, say, a seed has 2 replicas stored on any two nodes in the cluster, and a resulting partition doesn't happen to contain either of those nodes, then the seed will be unavailable in that partition until the partition recovers. You can work around this with "smart" initial placements (one near, one far, for example) but you're still at the mercy of random partition events. An additional factor is of course getting very unlucky with dropped gossip events, which would also slow convergence across the cluster.
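To put a rough number on that tradeoff, here is a back-of-the-envelope sketch. It assumes replicas are placed uniformly at random and that a partition is a uniformly random subset of nodes, which real networks will not satisfy exactly. The function name `p_seed_unavailable` is hypothetical.

```python
from math import comb


def p_seed_unavailable(n_nodes: int, replicas: int, partition_size: int) -> float:
    """Probability that a partition of `partition_size` nodes holds no replica.

    Assumes uniform random replica placement and a uniformly random partition:
    count the partitions drawn only from the non-replica nodes, then divide by
    the number of all possible partitions of that size.
    """
    return comb(n_nodes - replicas, partition_size) / comb(n_nodes, partition_size)


if __name__ == "__main__":
    # 10 nodes, 2 replicas, partition of 5 nodes: C(8,5) / C(10,5) = 56 / 252
    print(p_seed_unavailable(10, 2, 5))
```

Under these assumptions, a 5-node partition of a 10-node cluster with 2 replicas misses the seed about 22% of the time, and any partition holding more than `n_nodes - replicas` nodes is guaranteed to contain at least one replica.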
