Comment by arscan

2 hours ago

I interned at a company called Stratus, which built hardware fault-tolerant computers in the 80s/90s. I think they called it a “pair and spare” approach, where every component had 3 copies running and comparing state every cycle. If one component’s state stopped matching the other 2, the failing component would be taken offline and the system would call home for a replacement to be FedExed overnight. I think just about every component was hot-swappable too. Pretty cool, but expensive, and other architectures for improving availability, or mitigating the impact of lost availability, won out (except for a handful of exotic use cases).
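
For anyone curious what the compare-and-vote step looks like, here is a rough sketch in Python. It only illustrates the idea as described above (three copies, the odd one out gets pulled) and is not how the real hardware did it, which compared state in circuitry rather than software. The function name `vote` and the integer states are made up for illustration.

```python
from collections import Counter


def vote(states):
    """Compare replica states and return (majority_state, faulty_indices).

    `states` holds one hashable state value per replica, e.g. three
    register snapshots taken on the same cycle. (Hypothetical helper.)
    """
    majority_state, votes = Counter(states).most_common(1)[0]
    # Require a strict majority; otherwise we cannot tell who is wrong.
    if votes * 2 <= len(states):
        raise RuntimeError("no majority: cannot identify the faulty replica")
    # Any replica that disagrees with the majority is a candidate to go offline.
    faulty = [i for i, s in enumerate(states) if s != majority_state]
    return majority_state, faulty


if __name__ == "__main__":
    # Replica 1 has drifted from the other two on this cycle.
    state, faulty = vote([0x2A, 0x2B, 0x2A])
    print(f"agreed state: {state:#x}, take offline: {faulty}")
```

With three copies, a single bad component always loses the vote 2 to 1, so the system can keep running on the remaining two while the replacement ships.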