Comment by AnimalMuppet

4 years ago

You can have a complicated system that spends its complexity "budget" on redundancy and/or protection mechanisms. This means that when A fails, B keeps things working, and when B fails, C does. And so the system just keeps going, with B failed, and F, and K, and Q and X and Z. And maybe nobody (or very few people) notices that all these failed subsystems are adding up.

And then A fails, but hey, it's still running great!

And then C fails. And the system collapses, because A, B, and C all failed. And everybody thinks that it collapsed quickly, because nobody thinks of the collapse as starting when B failed.
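To put rough numbers on that timeline, here's a minimal sketch. The failure days are made up for illustration, and "the system is up while at least one of A, B, C still works" is my simplifying assumption, not a model of any real system.

```python
# Hypothetical failure days for three redundant components (made-up numbers).
# Assumption: the system stays up as long as at least one component works.
failure_day = {"B": 100, "A": 400, "C": 410}


def timeline(failure_day):
    """Return (day, component, healthy components left) in failure order."""
    remaining = len(failure_day)
    events = []
    for name, day in sorted(failure_day.items(), key=lambda kv: kv[1]):
        remaining -= 1
        events.append((day, name, remaining))
    return events


events = timeline(failure_day)
first_failure_day = events[0][0]  # where the collapse actually starts
outage_day = events[-1][0]        # the day everybody notices
degraded_days = outage_day - first_failure_day

for day, name, remaining in events:
    status = "DOWN" if remaining == 0 else "up"
    print(f"day {day}: {name} fails, {remaining} left, system {status}")
print(f"ran degraded for {degraded_days} days before the visible outage")
```

With these numbers the outage looks like it happened on day 410, but the system had been losing redundancy since day 100.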

TL;DR: A complex redundant system can run for a long time in a partially-failed state. If you measure only from the start of full failure, you can miss how long the collapse took.