Comment by skrebbel

2 years ago

Yeah so I think it might’ve been a real system limit of sorts. Something timing out somewhere, some pipe getting clogged in a way that their edge nodes couldn’t scale their way out of the way they usually do. E.g. because the scaling/monitoring code didn’t detect that particular pipe getting clogged. We had weird long-running HTTP requests at the time.

Note, this is pure conjecture. I’m just well aware from my own engineering experience that stuff can break under varied load in all kinds of unexpected ways. A large part of the work of an infrastructure business is going “whoa shit, I hadn’t expected that we could fail in that way too” and then building infrastructure to be able to handle that case. You simply can’t predict everything your customers are going to throw at you. I think this was what happened, combined with support that wasn’t sufficiently knowledgeable/experienced. But I admit that I’m really just guessing.

The alternative would be that CF purposefully dropped 10% of our traffic to convince us to upgrade to enterprise, and despite our bad experience, I don’t believe they’re that kind of business. And if they were, they handled it very badly, because it took them 3 weeks of foot-dragging to even bring up the upsell.

2 comments


Aeolun 2 years ago

> I don’t believe they’re that kind of business

I didn’t either, but then I read this post :/

skrebbel 2 years ago

Fwiw I still don’t. Large companies mess up too sometimes. This is what it looks like when a sales team messes up.