Comment by juancb

5 years ago

This may seem trite, but if you can get 20 servers' worth of performance out of one, you can afford to run two in active-active and still reap a 10x capex/opex saving. The technology for building simple but reliable systems has been around for decades. You also can't assume that the cloud will never fail, so you always have to defend against failure, whether that means running two servers or two availability zones.

Also, a lot of the time you're not trying to achieve uninterrupted uptime (and couldn't anyway); you just need rapid recovery from infrequent outages.

Apples to oranges: you’re not gonna get a 10x saving by running active-active in a strong consistency mode.

  • I don’t see how this follows. If you have one single-threaded server doing the job of 20 similarly specified servers running the distributed system, you could run every job twice, on two completely independent servers on two completely independent networks, and still be 10x as efficient unless both failed catastrophically during the same job. Or you could run three completely independent copies of the same job and still be at nearly 7x efficiency, etc. There is no need for any sort of “consistency mode” here. This is just brute force, without any synchronisation between servers or resumption of aborted jobs at all.
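
To make the arithmetic in the reply above concrete, here is a rough sketch. The function names and the 1% per-job failure probability are hypothetical, chosen only for illustration, and the failure calculation assumes the copies fail completely independently, as the reply describes.

```python
def efficiency_ratio(cluster_servers: int, independent_copies: int) -> float:
    """Servers in the distributed setup per single server used for brute-force redundancy."""
    if independent_copies < 1:
        raise ValueError("need at least one copy of the job")
    return cluster_servers / independent_copies


def all_copies_fail(p_fail: float, independent_copies: int) -> float:
    """Probability that every copy fails during the same job, assuming independent failures."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability")
    return p_fail ** independent_copies


if __name__ == "__main__":
    CLUSTER_SERVERS = 20   # figure taken from the thread
    P_FAIL = 0.01          # hypothetical per-job failure probability, not from the thread
    for copies in (1, 2, 3):
        ratio = efficiency_ratio(CLUSTER_SERVERS, copies)
        risk = all_copies_fail(P_FAIL, copies)
        print(f"{copies} copies: {ratio:.1f}x efficiency, P(all fail) = {risk:.6f}")
```

With 20 servers replaced by two independent copies the ratio is 10x; with three copies it is 20/3, or roughly 6.7x, which matches the "nearly 7x" figure. No coordination between the copies is modelled, since none is needed in this brute-force scheme.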