Comment by bumblehean

3 days ago

>Is Azure really this unreliable? There are concrete numbers in this blog. For those who use Azure, does it match your external experience?

IME, yes.

I'm currently working as an SRE supporting a large environment across AWS, Azure, and GCP. In terms of issues or incidents we deal with that are directly caused by cloud provider problems, I'd estimate that 80-90% come from Azure. And we're _really_ not doing anything that complicated in terms of cloud infrastructure; just VMs, load balancers, some blob storage, some k8s clusters.

Stuff on Azure just breaks constantly, and when it does break it's very obvious that Azure:

1. Does not know when they're having problems (it can take weeks/months for Azure to admit they had an outage that impacted us)

2. Does not know why they had problems (RCAs we're given are basically just "something broke")

3. Does not care that they had problems

Everyone I work with who interacts with Azure at all absolutely loathes it.

But doesn’t this experience contradict what OP is saying, in a way? If Azure is always breaking, wouldn’t that imply that changes like “adding smart pointers” are being introduced into the codebase?

  • I don't think it contradicts the OP. OP says the system is unreliable: memory leaks that lead to out-of-memory failures, for example. Smart pointers would stabilize things. (Also note that OP says their smart pointers PR was rejected.)

    • That's a generalization. Smart pointers can stabilize things, but used wrongly they can cause just as many issues. Sprinkling in smart pointers so that smart and raw pointers are mixed can cause double frees and huge maintenance issues. So, in my opinion, a single PR that introduces smart pointers is not necessarily a stability improvement. He should have created an architecture plan and gotten upstream and downstream aligned.
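To make the mixed-ownership point concrete, here is a minimal C++ sketch. It is not taken from the codebase under discussion: `Buffer`, `inspect`, and `safe_mixed_use` are hypothetical names invented for illustration. The hazardous pattern is shown only in a comment, since executing it would be undefined behavior; the function below it shows one way raw and smart pointers can coexist, assuming a single owner and raw pointers used strictly as non-owning views.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical resource type; counts live instances so leaks are observable.
struct Buffer {
    static int live;
    Buffer() { ++live; }
    ~Buffer() { --live; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
};
int Buffer::live = 0;

// Legacy-style function: borrows a raw pointer and must never delete it.
int inspect(const Buffer* b) { return b != nullptr ? 1 : 0; }

// Hazard (deliberately NOT executed): two independent owners of one raw pointer.
//   Buffer* raw = new Buffer;
//   std::unique_ptr<Buffer> a(raw);
//   std::unique_ptr<Buffer> b(raw);  // both destructors delete raw: double free

// Safer mixing: exactly one owner; raw pointers are non-owning views only.
int safe_mixed_use() {
    auto owner = std::make_unique<Buffer>();          // sole owner
    int seen = inspect(owner.get());                  // lend, do not transfer
    std::unique_ptr<Buffer> next = std::move(owner);  // explicit ownership transfer
    assert(owner == nullptr);                         // moved-from unique_ptr is empty
    return seen + Buffer::live;                       // one borrow seen, one live object
}   // next goes out of scope here and frees the Buffer exactly once
```

Calling `safe_mixed_use()` and then checking `Buffer::live` shows the object being released once. The convention (who owns, who borrows) is what has to be agreed across callers, which is why a single drive-by PR may not be enough on its own.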
