Comment by theideaofcoffee
4 hours ago
Same, my time at an F100 ecommerce retailer showed me the same thing. Every change control board justification needed an explicit back-out/restoration plan with exact steps, what was being monitored to verify the plan was being followed, contacts for the key groups expected to be affected, and emergency numbers/rooms for quick conferences if something did go wrong.
The process was pretty tight, and there were almost no revenue-affecting outages that I can remember, because it was such a collaborative effort (even though the board presentations felt a bit spiky and confrontational at the time, everyone was working together).
And you moved at a glacial pace compared to Cloudflare. There are tradeoffs.
Yes, of course, I want the organization that inserted itself into handling 20% of the world's internet traffic to move fast and break things. Like breaking the internet on a bi-weekly basis. Yep, great tradeoff there.
Give me a break.
While you're taking your break, exploits gain traction in the wild, and one of the value propositions of using a service provider like CloudFlare is catching and mitigating these exploits as fast as possible. From the OP, this outage was related to handling a nasty RCE.
But if your job is to mitigate attacks/issues, then things can get very broken while you're being slow to mitigate them.
Lest we forget, they initially rose to prominence by being cheaper than the existing solutions, not better, and I suppose this is a tradeoff a lot of their customers are willing to make.
This sounds just as bad as yolo-merges, only at the other end of the spectrum.