Comment by k8sToGo
5 hours ago
It's not about outages. It's about the why. Hardware can fail. Bugs can happen. But to continue a roll out despite warning sings and without understanding the cause and impact is on another level. Especially if it is related to the same problem as last time.
And yet, it's always clownflare breaking everything. Failures are inevitable, which is widely known, therefore we build resilience systems to overcome the inevitable
It is healthy for tech companies to have outages, as they will build experience in resolving them. Success breeds complacency.