← Back to context

Comment by pell

10 hours ago

It’s that time of the year again where we all realize that relying on AWS and Cloudflare to this degree is pretty dangerous but then again it’s difficult to switch at this point.

If there is a slight positive note to all this, then it is that these outages are so large that customers usually seem to be quite understanding.

Unless you’re say at airport trying to file a luggage claim … or at the pharmacy trying to get your prescription. I think as a community we have a responsibility to do better than this.

  • > I think as a community we have a responsibility to do better than this.

    I have always felt so, but my opinion is definitely in the minority.

    In fact, I find that folks have extremely negative responses to any discussion of improving software Quality.

    • I always see such negative responses when HN brings up software bloat ("why is your static site measured in megabytes").

      Now that we have an abundance of compute and most people run devices more powerful than the devices that put man on the moon, it's easier than ever to make app bloat, especially when using a framework like Electron or React Native.

      People take it personally when you say they write poor quality software, but it's not a personal attack, it's an observation of modern software practices.

      And I'm guilty of this, mainly because I work for companies that prioritize speed of development over quality of software, and I suspect most developers are in this trap.

      2 replies →

    • Merely reducing external dependencies causes people to come out in rashes.

      A large proportion of “developers” enjoy build vs buy arguments far too much.

  • You aren’t cloudflare’s customer in these examples. It depends on the companies that are actually paying for and using the service to complain. Odds are that they won’t care on your behalf due to how our society is structured.

    Not really sure how our community is supposed to deal with this.

    • “We” are the ones making the architecture and the technical specs of these services. Taking care for it to still work when your favourite FAANGMC is down seems like something we can help with.

> If there is a slight positive note to all this, then it is that these outages are so large that customers usually seem to be quite understanding.

Which only shows that chasing five 9s is worthless for almost all web products. The idea is that by relying on AWS or Cloudflare you can push your uptime numbers up to that standard, but these companies themselves are having such frequent outages that customers themselves don't expect that kind reliability from web products.

If I choose AWS/cloudflare and we're down with half of the internet, then I don't even need to explain it to my boss' bosses, because there will be an article in the mainstream media.

If I choose something else, we're down, and our competitors aren't, then my overlords will start asking a lot of questions.

  • Yup. AWS went down at a previous job and everyone basically took the day off and the company collectively chuckled. Cloudflare is interesting because most execs don’t know about it so I’d imagine they’d be less forgiving. “So what does cloudflare do for us exactly? Don’t we already have aws?”

  • And if everyone else is down, and you are not, you will get no credit.

    • Or _you_ aren't down, but a third-party you depend on is (auth0, payment gateway, what have you), and you invested a lot of time and effort into being reliable, but it was all for less than nothing, because your website loads but customers can't purchase, and they associate the problem with you, not with the AWS outage.

      1 reply →

  • In reality it is not half of the internet. That is just marketing. I've personally noticed one news site while others were working. And I guess sites like that will get the blame.

Happy to hear anyone's suggestions about where else to go or what else to do in regards to protecting from large-scale volumetric DDoS attacks. Pretty much every CDN provider nowadays has stacked up enough capacity to tank these kind of attacks, good luck trying to combat these yourself these days?

  • Somehow KiwiFarms figured it out with their own "KiwiFlare" DDOS mitigation. Unfortunately, all of the other Cloudflare-like services seem exceptionally shady, will be less reliable than Cloudflare, and probably share data with foreign intelligence services I have even less trust for than the ones Cloudflare possibly shares them with.

  • Anubis and/or Bunny are good alternatives/combination depending on your exact needs

    - https://anubis.techaro.lol/

    - https://bunny.net/

    • Unfortunately Anubis doesn't help where my pipe to the internet isn't fat enough to just eat up all the bandwidth that the attacker has available. Renting tens of terabits of capacity isn't cheap and DDoS attacks nowadays are in the scale of that. BunnyCDN's DDoS protection is unfortunately too basic to filter out anything that's ever so slightly more sophisticated. Cloudflare's flexibility in terms of custom rulesets and their global pre-trained rulesets (based on attacks they've seen in the past) is imo just unbeatable at this time.

      2 replies →

    • Why do people on a technical website suggest this? It's literally the same snake oil as Cloudflare. Both have an endgame of total web DRM; they want to make sure users "aren't bots". Each time the DRM is cracked, they will increase its complexity of the "verifier". You will be running arbitrary code in your big 4 browser to ensure you're running a certified big 4 browser, with 10 trillion man hours of development, on an certified OS.

      2 replies →

  • Just accept that a DDoS might happen and that there's nothing you can do about it. It's fine, it's just how the Internet works.

    • That was possible when a DDos was usually still an occasional attack by a bad actor.

      Most time I get ddosed now it's either Facebook directly, Something something Azure or any random AI.

      7 replies →

Oh no, we had 30 minutes of downtime this year :(

  • 5 9's is like 7 minutes a year. They are breaking SLAs and impacting services people depend on

    Tbh though this is sort of all the other companies fault, "everyone" uses aws and cf and so others follow. now not only are all your chicks in one basket, so is everyone elses. When the basket inevitably falls into a lake....

    Providers need to be more aware of their global impact in outages, and customers need to be more diverse in their spread.

  • I do think this is tenable as long as these services are reliable. Even though there have been some outages I would argue that they’re incredibly reliable at this point. If though this ever changes the costs to move to a competitor won’t be as simple as pushing a repository elsewhere, especially for AWS. I think that’s where some of the potential danger lies.

    • > 30 minutes of downtime

      > this is tenable as long as these services are reliable

      do you hear yourself, this is supposed to be a distributed CDN. imagine if HTTP had 30 minutes of downtime a year.

      and judging by the HN post age, we're now past minute 60 of this incident.

      4 replies →

    • > especially for AWS

      CF can be just as difficult if not more to migrate off of especially when using things like durable objects