Comment by pclmulqdq
3 years ago
I think this is the right approach, and I really admire the work you do at ScyllaDB. For something truly critical, you do want multiple nodes available (at least 2, and preferably 3). However, you should also want backup copies in multiple datacenters, not just one.
Today, if I were running something that absolutely needed to be up 24/7, I would run a 2x2 or 2x3 configuration with async replication between primary and backup sites.
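A rough sketch of the arithmetic behind that kind of layout, reading "2x3" as two sites with three full replicas each (that reading is an assumption, and `Layout` and `writes_at_risk` are hypothetical names used only for illustration, not any database's API):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    sites: int           # independent datacenters (assumed: primary + backup)
    nodes_per_site: int  # full replicas inside each site (assumed)

    def total_nodes(self) -> int:
        return self.sites * self.nodes_per_site

    def survives_site_loss(self) -> bool:
        # Losing one whole site is survivable only if another site remains.
        return self.sites >= 2

    def local_failures_tolerated(self) -> int:
        # Nodes that can fail in one site before that site has no copy left.
        return self.nodes_per_site - 1


def writes_at_risk(write_rate_per_s: float, replication_lag_s: float) -> float:
    # With async replication, writes not yet shipped to the backup site
    # can be lost if the primary site disappears. This is a crude estimate.
    return write_rate_per_s * replication_lag_s


if __name__ == "__main__":
    for layout in (Layout(2, 2), Layout(2, 3), Layout(1, 3)):
        print(
            f"{layout.sites}x{layout.nodes_per_site}: "
            f"{layout.total_nodes()} nodes, "
            f"survives site loss: {layout.survives_site_loss()}, "
            f"node failures tolerated per site: {layout.local_failures_tolerated()}"
        )
    print("writes at risk:", writes_at_risk(500, 2.0))
```

The point of the second function is the trade-off in "async": the backup site keeps you running after a site loss, but anything written inside the replication lag window may be gone.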
Exactly. Regional distribution can be vital. Our customer Kiwi.com had a datacenter fire. Ten of their 30 nodes were reduced to a slag heap of ash and metal. But the other 20 nodes in their cluster were in completely different datacenters, so they lost zero data and kept running non-stop. This is a rare story, but you do NOT want to be one of the thousands of others who had only one datacenter, with backups stored there that burned up along with their main servers. Oof!
https://www.scylladb.com/2021/03/23/kiwi-com-nonstop-operati...