It's not about outages. It's about the why. Hardware can fail. Bugs can happen. But continuing a rollout despite warning signs, and without understanding the cause and impact, is on another level. Especially if it is related to the same problem as last time.
And yet, it's always clownflare breaking everything. Failures are inevitable, which is widely known; that's why we build resilient systems to overcome the inevitable.
Like who? Which large tech company doesn't have outages?
Google does pretty good.
Google Docs was just down a couple of weeks ago for almost the whole day.
"tripping on their own feet" == "not rolling back"