Comment by woliveirajr

6 years ago

> Unfortunately, last Tuesday’s update contained a regular expression that backtracked enormously and exhausted CPU used for HTTP/HTTPS serving.

One of those cases where they had 1 problem, used a regular expression, and ended up with 2 problems?

Edit: I really like how much information is given by Cloudflare. The 11 points in the "what went wrong" analysis are how every root-cause analysis should be done.

Somewhat humorous, as someone [1] (congrats /u/fossuser!) mentioned this failure scenario in the thread about Twitter being down yesterday.

"Pushing bad regex to production, chaos monkey code causing cascading network failure, etc.", in response to a comment from someone who previously worked at Cloudflare.

[1] https://news.ycombinator.com/item?id=20415608

I agree this is an awesome post and a really great example of how every root-cause analysis should be done. I am also impressed by their incident response.