Comment by YetAnotherNick
5 days ago
Single point of failure means exactly the opposite of what you think it means. If my work depends on 5 services being up, each service is a single point of failure, and correlated failure actually improves the probability that I can do my work.
I see what you're saying but I have to push back.
"If one thing I need is going to be down, everything might as well be down."
If I have a product with 5 dependencies and one of them is down, there are things I can do to partially mitigate. A circuit breaker would allow my thing to at least stay up and responsive. Maybe I could get a status message up and turn off a feature flag to disable whatever calls that dependency.
On the other hand, if all my dependencies are down AND the management layer is down AND the AWS portal is not functioning correctly, I'm pretty much SOL.
Massive centralization is never, ever a good thing for anyone other than the ones who are doing the centralizing.
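For anyone unfamiliar with the circuit breaker idea mentioned above, here is a minimal sketch in Python. It is an illustration only, not any particular library's API: the class name, `max_failures`, `reset_after`, and the `fetch_recommendations` / `degraded` functions are all made up for the example.

```python
import time


class CircuitBreaker:
    """Tiny circuit breaker: opens after repeated failures, retries after a cooldown."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cooldown in seconds before a trial call
        self.clock = clock                # injectable so the cooldown can be tested
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, fallback):
        # While open and still cooling down, skip the dependency entirely.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            # Cooldown elapsed: fall through and allow one trial call (half-open).
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


# Hypothetical dependency that is currently down.
def fetch_recommendations():
    raise ConnectionError("dependency is down")


# Degraded response shown while the dependency is unavailable.
def degraded():
    return "recommendations temporarily unavailable"


breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
for _ in range(3):
    print(breaker.call(fetch_recommendations, degraded))
```

After the second failure the breaker opens, so the third call returns the fallback without touching the broken dependency at all. That is the "stay up and responsive" part: the app stops waiting on timeouts from something it already knows is down.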
So if you can just run without one service, what's stopping you from removing the dependency altogether? Why would you only want to remove the dependency when the service is down?
So, to give a real example: my application depends on AWS's EC2, RDS, EKS, and S3, Cloudflare's DNS, and Redis' instance. If any of those stops working, it goes down. If everyone is within SLA, they might as well go down together rather than separately.
This is a really interesting point, because I could see a situation where your application requires integration with, say, 10 services. If they all run on AWS, they either all go down together or all run together. If they're all self-hosted, there's a good chance that at any given time one of the ten is down, and so your service can't run.
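To put rough numbers on that, here is a small Python sketch comparing the two extremes. The 99.9% per-service availability is an assumed figure for illustration, not something from the thread, and "fully independent" versus "perfectly correlated" are idealized cases; real outages fall somewhere in between.

```python
def all_up_independent(availability: float, n: int) -> float:
    """Probability that n services are all up when failures are independent."""
    return availability ** n


def all_up_correlated(availability: float) -> float:
    """Probability that all services are up when they fail together as one unit."""
    return availability


per_service = 0.999  # assumed availability of each service (illustrative)
for n in (5, 10):
    independent = all_up_independent(per_service, n)
    correlated = all_up_correlated(per_service)
    print(f"{n} services: independent {independent:.4%}, correlated {correlated:.4%}")
```

With ten hard dependencies at 99.9% each, independent failures leave everything up only about 99.0% of the time, roughly ten times the downtime of the perfectly correlated case. This only measures "everything is up", though; it says nothing about how much you can still do during a partial outage, which is the other side of the argument above.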