The Cloudflare outage might be a good thing

5 days ago (gist.github.com)

It would be a good thing, if it would cause anything to change. It obviously won't. As if a single person reading this post wasn't aware that the Internet is centralized, and couldn't name a few specific sources of centralization (Cloudflare, AWS, Gmail, Github). As if this were the first time it has happened. As if after the last time AWS failed (or the one before that, or the one before…) anybody stopped using AWS. As if anybody could viably stop using them.

  • If anything, centralisation shields companies using a hyperscaler from criticism. You’ll see downtime no matter where you host. If you self host and go down for a few hours, customers blame you. If you host on AWS and “the internet goes down”, then customers treat it akin to an act of God, like a natural disaster that affects everyone.

    It’s not great being down for hours, but that will happen regardless. Most companies prefer the option that helps them avoid the ire of their customers.

    Where it’s a bigger problem is when an entire critical industry, like retail banking in a country, all chooses AWS. When AWS goes down, all citizens lose access to their money. They can’t pay for groceries or transport. They’re stranded and starving; life grinds to a halt. But even then, this is not the bank’s problem, because they’re not doing worse than their competitors. It’s something for the banking regulator and government to worry about. I’m not saying the bank shouldn’t worry about it; I’m saying that in practice they don’t worry about it unless the regulator makes them.

    I completely empathise with people frustrated with this status quo. It’s not great that we’ve normalised a few large outages a year. But for most companies, this is the rational thing to do. And barring a few critical industries like banking, it’s also rational for governments to not intervene.

    • > If anything, centralisation shields companies using a hyperscaler from criticism. You’ll see downtime no matter where you host. If you self host and go down for a few hours, customers blame you.

      Not just customers. Your management take the same view. Using hyperscalers is great CYA. The same for any replacement of internally provided services with external ones from big names.

    • I think this really depends on your industry.

      If you cannot give a patient life-saving dialysis because you don't have a backup generator, you are likely facing some liability. If you cannot give a patient life-saving dialysis because your scheduling software is down due to a major outage at a third party and you have no local redundancy, you are in a similar situation. Obviously this depends on your jurisdiction, and we are probably in different ones, but I feel confident you'd want to live in a district where a hospital is held reasonably responsible for such foreseeable disasters.

    • >If anything, centralisation shields companies using a hyperscaler from criticism. You’ll see downtime no matter where you host. If you self host and go down for a few hours, customers blame you.

      What if you host on AWS and only you go down? How does hosting on AWS shield you from criticism?

  • I’m pretty cloudflare centric. I didn’t start that way. I had services spread out for redundancy. It was a huge pain. Then bots got even more aggressive than usual. I asked why I kept doing this to myself and finally decided my time was worth recapturing.

    Did everything become inaccessible the last outage? Yep. Weighed against the time it saves me throughout the year I call it a wash. No plans to move.

    • I'm of a similar mindset... yeah, it's inconvenient when "everything" goes down... but realistically so many things go down now and then, it just happens.

      Could just as easily be my home's internet connection, or a service I need from/at work, etc. It's always going to be something, it's just more noticeable when it affects so many other things.

  • > It would be a good thing, if it would cause anything to change. It obviously won't.

    I agree wholeheartedly. The only change will be internal to these organizations (e.g. Cloudflare, AWS): improvements will be made to the relevant systems, and some teams will also audit for similar behavior, add tests, and fix some bugs.

    However, nothing external will change. The cycle of pretending you're going to implement multi-region fades after a week, and each company goes on leveraging all these services to the Nth degree, waiting for the next outage.

    Not advocating that organizations should/could do much, it's all pros/cons. But the collective blast radius is still impressive.

    • The root cause is customers refusing to punish this downtime.

      Check out how hard customers punish blackouts from the power grid, both with their wallets and via voting/government pressure. That's why grids are now more reliable.

      So unless the backbone infrastructure gets the same flak, nothing is going to change. After all, any change is expensive, and the cost of that change needs to be worth it.

    • And even in multi-region, you experience a DNS failure and it all goes up in flames anyway. There's always going to be something.

  • It’s just a function of costs vs. benefits. For most people, building redundancy at this layer costs far more than the benefits are worth.

    If Cloudflare or AWS go down, the outage is usually so big that smaller players have an excuse and people accept that.

    It’s as simple as that.

    “Why isn’t your site working?” “Half the internet is down, here read this news article: …” “Oh, okay, let me know when it’s back!”

  • Same idea with the Crowdstrike bug: it seems like it didn't have much of an effect on their customers, certainly not on my company at least, and the stock quickly recovered, in fact it's doing very well. For me, it looks like nothing changed, no lessons learned.

  • With the rise in unfriendly bots on the internet as well as DDoS botnets reaching 15 Tbps, I don’t think many people have much of a choice.

  • > As if anybody could viably stop using them.

    To be fair, AWS (and GCP and Azure) at least is easy to replace with something else, and pretty much all the alternatives are cheaper, less messy, etc. There are very few situations where you cannot viably do so.

    We live in a world where you can get things like dedicated servers, etc. within similar time spans as creating a "compute engine" node on a big cloud provider.

    Ironically, the serious limitations cloud services imposed on applications (externalized state management, more unified ways of passing configuration, etc.) mean that running your own infrastructure is easier than ever: your devs won't end up whining at you for something super custom just to make some project a bit easier. But if you really want to, you still can.

    GitHub also has become easy to get away from and indeed many individuals and companies did so.

    CDNs are the bigger thing, but a) there are a lot of other CDNs, and b) having an image, or let's say an Ansible config, lets you quickly deploy something that might be close enough for your use case. Just pick any hosting company, or even a dozen around the world.

    Of course, if you allowed yourself to end up in complete vendor lock-in, things might be different; but if you think it's a good idea to be completely dependent on the whims of some other company, maybe you deserve that state. As in: don't run a business without having any kind of fallback for the decisions you make. Yes, profit from the big benefit something might give you, but don't lock the door behind you.

    Sure you might be lucky and sure maybe you are fine going for luck while it lasts. Just don't be surprised when it all shatters.

  • > As if anybody could viably stop using them.

    It is as easy to not use them as it ever was. There has been no actual centralisation. Everything is done using open protocols. I don't know what more you could want.

    Compare it to Windows where there is deep volume discounting and salespeople shmoozing CTOs and getting in with schools, healthcare providers etc etc. That's actual lock-in.

  • These outages are too few and far between. It’s gonna take a monthly event to make some changes. If businesses start to lose connectivity for 8 hours every month, maybe the bigger ones are going to run for self-hosting, or at least some capacity of self-hosting.

    • Yeah, agreed. But even in the case of 8 hours of monthly downtime (still almost a 99% SLA), self-hosting isn't beneficial for really small firms.

  • > It obviously won't.

    Here's where we separate the men from the boys, the women from the girls, the Enbys from the enbetts, and the SREs from the DevOps. If you went down when Cloudflare went down, do you go multicloud so that can't happen again, or do you shrug your shoulders and say "well, everyone else is down"? Have some pride in your work, do better, be better, and strive for greatness. Have backup plans for your backup plans, and get out of the pit of mediocrity.

    Or not, shit's expensive and kubernetes is too complicated and "no one" needs that.

    • You make the appropriate cost/benefit decision for your business and ignore apathy on one side and dogma on the other.

  • > As if anybody could viably stop using them.

    You can, and even save money.

  • Same with the big Crowdstrike fail of 2024. Especially when everyone kept repeating the laughable statement that these guys have their shit in order, so it couldn't possibly be a simple fuckup on their end. Guess what, they don't, and it was. And nobody has realized the importance of diversity for resilience, so all the major stuff is still running on Windows and using Crowdstrike.

Does the author of this post not see the irony of posting this content on Github?

My counterargument is that "centralization" in a technical sense isn't about which company owns things but about how services are operated. Cloudflare itself is very decentralized.

Furthermore, I've seen regional outages caused by things like anchors dropped by ships in the wrong place, or a shark biting a cable; regional power outages caused by squirrels, etc. Outages happen.

If everyone ran their own server from their own home, AT&T or Level3 could have an outage and still take out similar swathes of the internet.

With CDNs like Cloudflare, if Level3 had an outage, your website wouldn't go down just because your home or VPS host's upstream transit happens to be Level3 (or whatever they call themselves these days): your content, at least the static parts, is cached globally.

The only real reasonable alternative is something like IPFS, web3, and similar ideas.

Cloudflare has always called itself a content transport provider, think of it as such. But also, Cloudflare is just one player, there are several very big players. Every big cloud provider has a competing product, not to mention companies like Akamai.

People are rage-posting about Cloudflare especially because it has made CDNs accessible to everyone. You can easily set up a free Cloudflare account and be on your merry way. This isn't something you should be angry about. You're free to pay for any number of other CDNs; many do.

If you don't like how much market share Cloudflare has, then come up with a similarly competitive alternative and profit. Just this HN thread alone is enough for me to think there is a market for more players. Or just spread the word about the competition that exists today: use frontdoor, cloudfront, netlify, flycdn, akamai, etc. It's hardly a monopoly.

I don't know how many times I need to say this, but I will die on this hill.

Centralized services don't decrease redundancy. They're usually far more redundant than whatever homegrown solution you can come up with.

The difference between centralized and homegrown is mostly psychological. We notice the outages of centralized systems more often, as they affect everything at the same time instead of different systems at different times. This is true even if, in a hypothetical world with no centralization, we'd have more total outage time than we do now.

If your gas station says "closed" due to a problem that only affects their own networks, people usually go "aah they're probably doing repairs or something", and forget about the problem 5 minutes later. If there's a Cloudflare outage... everybody (rightly) blames the Cloudflare outage.

Where this becomes a problem is when correlated failures are actually worse than uncorrelated ones. If Visa goes down, it's better if Mastercard stays up, because many customers have both and can use the other when one doesn't work. In some ways, it's better to have 30 mins of Visa outages today and 30 mins of Mastercard outages tomorrow, than to have just 15 mins of correlated outages in one day.
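The arithmetic behind that last claim can be sketched directly. A toy model (intervals in minutes, entirely made-up numbers, not real Visa/Mastercard data): a customer holding both cards is only fully locked out while both networks are down at once.

```python
def locked_out_minutes(outages_a, outages_b):
    """Minutes during which BOTH providers are down simultaneously.
    Each outage list holds (start_minute, end_minute) intervals."""
    total = 0
    for a_start, a_end in outages_a:
        for b_start, b_end in outages_b:
            overlap = min(a_end, b_end) - max(a_start, b_start)
            total += max(0, overlap)  # negative overlap means no intersection
    return total

# Uncorrelated: 30 min of "Visa" downtime today, 30 min of "Mastercard" tomorrow.
uncorrelated = locked_out_minutes([(100, 130)], [(1540, 1570)])

# Correlated: both down for the same 15 minutes (shared infrastructure).
correlated = locked_out_minutes([(100, 115)], [(100, 115)])

print(uncorrelated)  # 0  -- the other card always works
print(correlated)    # 15 -- no fallback during the shared outage
```

So even though the correlated world has half the total outage minutes, it is the only one in which the customer is ever left with no working card at all.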

  • "Redundancy" might not be the correct word. If we had a single worldwide mega-entity serving 100% of the internet, it would be both a monopoly and have tons of redundant infrastructure.

    But it would also be quite unified: the system, while full of redundancies, would be a single one, operated the same way end to end. By virtue of being handled uniformly, a single glitch could bring it all down. There is no diversity in the system's implementation; the monoculture itself makes it vulnerable.

  • The problem is creating a single point of failure.

    There's no doubt a VM in AWS is vastly more redundant than my VM running on a couple of Intel NUCs in my closet.

    The difference is, when I have a major outage, my blog goes down.

    When EC2 has a major outage, all of the blogs go down. Along with Wikipedia, Starbucks, and half the internet.

    That single point of failure is the issue.

    • Single point of failure means exactly the opposite of what you think it means. If my work depends on 5 services being up, each service is a single point of failure, and correlation between their failures improves the probability that I can do my work.

  • In my experience services aren't failing due to a lack of redundancy but due to an excess of complexity. With the move to the cloud we are continually increasing both redundancy and complexity and this is making the problem worse.

    I have a cheap VPS that has run reliably for a decade except for a planned hour of downtime. Which was in the middle of the night when no-one cared. Amazon is more reliable in theory. My cheap VPS is more reliable in practice.

Every HN comment seems to say the same thing: downtime is inexcusable and the centralization of these services is ruining the internet.

I still don't see the big deal. 12 hours of downtime once every couple years isn't the end of the world. So people can't log into their bank website for a few hours -- banks used to only be open for like 4 hours a day and somehow we all survived. Twitter is down? Oh what a tragedy. Customers get some refunds, Cloudflare fixes the issue, and people move on with life.

Cars still break down occasionally after 100+ years of engineering for reliability and safety. The power still goes out every now and then. Cook on the stove. The cost of making everything perfect all the time just isn't worth it.

I run my own servers on my own network and do not use Cloudflare. My stuff goes down too. And it's "decentralized" in the way you think the internet "should" be, which entails its own risks. So what do you all want, exactly? A public lashing of every developer at Cloudflare who pushes a bug to prod? A congressional investigation? I just don't understand the outrage here.

Stuff breaks occasionally. Get used to it, and design accordingly.
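To put numbers on "isn't the end of the world": an uptime percentage translates into a concrete downtime budget. A quick sketch of the standard conversion, assuming a non-leap 365-day year:

```python
# Convert an uptime SLA percentage into an allowed-downtime budget per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

def downtime_minutes_per_year(uptime_percent):
    """Minutes per year a service can be down while still meeting the SLA."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)

for sla in (99.0, 99.9, 99.99, 99.999):
    print(f"{sla}% uptime allows {downtime_minutes_per_year(sla):.1f} min/year down")

# 99%     allows roughly 3.7 days per year
# 99.999% allows roughly 5.3 minutes per year
```

By this yardstick, 12 hours every couple of years works out to about 99.93% uptime, comfortably inside the roughly 99.9% that many paid cloud SLAs promise anyway.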

  • > So people can't log into their bank website for a few hours, banks used to only be open for like 4 hours a day and somehow we all survived.

    1. I believe it's payment processing systems not functioning properly that causes real problems for people and not simply bank websites being down. Especially given...

    2. Banks being closed so much back when cash/checks were actually widely used wasn't an issue because you could just pop over to an ATM or whip out a checkbook. In today's system, every single purchase you make requires communication between the merchant, your bank, and any number of middlemen via the internet.

    Yeah, cash is still used today, but I've noticed even things like school sports events have stopped taking cash altogether and simply post a QR code to buy from your phone.

    That is, unless the school has crap cell reception (with no public Wi-Fi either!), Cloudflare shits the bed, Visa thinks you're buying porn, you locked your debit card and now can't unlock it because the website is down, or any one of the million other things that break all the time. Replace "school sports event" with literally every single thing that requires a financial transaction, and it's easy to see how even a short outage can lead to actual harm.

  • From a consumer's perspective, that makes sense. From a business's perspective, downtime can mean a significant loss of revenue or new business opportunities.

    • The costs of perfection are much, much greater. Are you willing to pay 2-3x the cost of everything to go from 99.999% to 100.0000000% uptime?

      Probably the only things in existence with 100.00% uptime are our nuclear missile command and control systems. Like, even my pen runs out of ink sometimes. It's just crazy how hard it is to have stuff work all the time.

    • I wonder if consolidation actually makes this less of an issue for businesses?

      If my website is down but my competitor's isn't, I might lose business to them. If my competitor's website is also down, where are the customers gonna go?

"The Cloudflare outage was a good thing [...] they're a warning. They can force redundancy and resilience into systems."

- he says. On Github.

Spot on article, but without a call to action. What can we do to combat the migration of society to a centralized corpro-government intertwined entity with no regard for unprofitable privacy or individualism?

  • Individuals are unlikely to be able to do something about the centralization problem except vote for politicians that want to implement countermeasures. I don’t know of any politicians (with a chance to win anything) that have that on their agenda.

    • There is a crucial step between having an opinion and voting: conversations within society. That's what makes democracy work and facilitates change. If you only take your opinion, isolated from everybody else, and vote from that, there isn't much democracy going on and your chance of change is slim. It's when broad conversations happen that movements have an impact.

      And that step happens here on HN. That's why it's very relevant to observe that the HN crowd is increasingly happy to support a non-free internet, be it walled gardens, geofencing, etc.

    • That’s called antitrust, and is absolutely a cause you can vote for. Some of the Biden administration’s biggest achievements were in antitrust, and the head of the FTC for Biden has joined Mamdani’s transition team.

  • Learn how to host anything, today.

    • Even if you learn to host, there are many other services you rely on that sit on those centralised platforms. If you are thinking of hosting every single thing on your own, it is going to be more work than you can even imagine, and definitely super hard to organise as well.

    • If you host, you're likely running my cPanel software; 70% of the internet is doing that. Also a kinda centralized point of failure, but I haven't heard of any bugs in the last 14 years.

    • Have you tried that? I gave up on hosting my own email server seven or eight years ago, after it became clear that there would be an endless fight with various entities to accept my mail. Hosting a webserver without the expectation that you'll need some high powered DDOS defense seems naive, in the current day, and good luck doing that with a server or two.

So we're going backwards to a world where there are basically 5 computers running everything and everyone accesses the world through a dumb terminal, even though the digital slab in our pockets has more compute than a roomful of early-generation devices. Hopefully critical infra shifts back to managed metal or private clouds, but I don't see it: after the last decade of cloud evangelism moving all legacy systems to the cloud, it doesn't look like reversing anytime soon.

  • Yeah it's crazy to realize it takes a room of electronics for me to get my (g)mail. The more things change, the more they stay the same, eh?

  • I agree, considering all the Cloudflare/AWS/Azure apologists I see all around... Learning AWS is already the #1 tip on social media to "become employed as a dev in 2025, guaranteed", and I always just sigh when seeing this. I wouldn't touch it with a stick.

"Embrace outages, and build redundancy." — It feels like back in the day this was championed pretty hard especially by places like Netflix (Chaos Monkey) but as downtime has become more expected it seems we are sliding backwards. I have a tendency to rely too much on feelings so I'm sure someone could point me to some data that proves otherwise but for now that's my read on things. Personally, I've been going a lot more in on self-hosting lots of things I used to just mindlessly leave on the cloud.

  • I have cell phone calls regularly drop during tower handoffs, and codec errors that result in a blast of static upon answering a call. I can't remember a single time I had a phone call fail on the old PSTN built out of DMS10 and DMS100s locally (well, until we lost all trunks due to a fibre issue a couple of weeks ago on November 10th -- the incumbent didn't notice the outage which started at ~3:20am until ~9:30am, and it wasn't fixed until 17:38). One time when I was a teenager in the '90s, a friend and I had a 14 hour call using landlines.

    The modern tech stack is disappointing in its lack of reliability. Complexity is the root of all evil.

I don't get why this applies to the Cloudflare outage but not to the AWS ones... I'd argue that the big cloud providers are WAY more impactful when they go down than Cloudflare. The only difference is that the average techie uses Cloudflare more and sees the impact more, but this point already applied before...

What happens if you don't use Cloudflare and just host everything on a server?

Can't you run a website like that if you don't host heavy content?

How common are DDoS attacks anyway? And aren't there tools, local to the server, that analyze user behavior to a decent accuracy (at least enough to tell that a visitor is using a real browser and behaving more or less like a human, making attacks expensive)?

Can't you buy a list of ISP ranges from a GeoIP provider (you can)? At least then you'd know which addresses belong to real humans.

I don't think botnets are that big of a problem (maybe in some obscure places of the world, but you can temp rangeban a certain IP range, if there's a lot of suspicious traffic coming from there).

If lots of legitimate networks (as in, belonging to people who are paying an ISP for their connections) host botnets, that means most PCs are compromised, which is a much more severe issue.
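For what it's worth, the range-ban mechanics are the easy part and need nothing beyond the standard library; the hard part is curating the list, and none of it helps once a volumetric DDoS saturates your link. A minimal sketch, using reserved documentation ranges (RFC 5737) as made-up placeholders rather than a real abuse list:

```python
import ipaddress

# Hypothetical blocklist: RFC 5737 documentation ranges, standing in for
# whatever ranges you decided to temporarily ban (not a real abuse feed).
BANNED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def is_banned(client_ip: str) -> bool:
    """True if the client address falls inside any banned range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BANNED_RANGES)

print(is_banned("203.0.113.42"))  # True
print(is_banned("192.0.2.1"))     # False
```

In practice a firewall or reverse proxy does this same membership check far more efficiently, but the logic is identical.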

  • > What happens if you don't use Cloudflare and just host everything on a server?

    It works.

    > Can't you run a website like that if you don't host heavy content?

    Even with heavy content, the question is how many visitors you have. If there's one an hour, a 100 Mbit unlimited connection would suffice.

    > How common are DDOS attacks anyway

    Extremely rare. 99% of sites never experience it, 1% do have some trouble because somebody nearby is being DDoS'ed.

    > and aren't there tools, local to the server, that analyze user behavior to a decent accuracy (at least enough to tell that a visitor is using a real browser and behaving more or less like a human, making attacks expensive)?

    No point, you can't do anything anyway - it's a denial of service so there are gigabytes of trash flowing your way.

    > Can't you buy a list of ISP ranges from a GeoIP provider (you can), at least then you'd know which addresses belong to real humans.

    No point. If you are not being DDoS'ed, then you've just spent money and time (i.e. money) on a preventive measure you'll never use. And when (if) it comes, you can't do anything anyway, because it's a distributed denial of service attack.

    > I don't think botnets are that big of a problem (maybe in some obscure places of the world, but you can temp rangeban a certain IP range, if there's a lot of suspicious traffic coming from there).

    It's not a DDoS if you can filter at the endpoint.

  • Yeah, you can.

    Lots of people use Raspberry Pis for this, which are a smidge anaemic for decent load (the HN Hug Of Death); even an Intel N100 has more grunt, for context.

    This makes people think their self-hosting setup can never handle HN load, because whenever they see people talking about self-hosting, the site goes down.

    • Most people shouldn't use a Pi because most people can't configure a web server securely. A VPS would be a better option for just about everybody trying to "self-host" whether they put Cloudflare in front of it or not.

  • Botnets use real residential connections, not just data centers, so your static list of “real people” doesn’t really make a difference.

  • voip.ms was pretty much offline for a couple of weeks while under a lengthy DDoS attack. They were only able to restore service by putting all their servers behind Cloudflare proxies to mitigate the ongoing DDoS.

It's worth considering the counterfactual. Let's say there were a few dozen semi-popular DDoS protection services. Would that be better? Some assumptions: the services would be slightly less effective and also have worse downtimes. You could argue that Cloudflare is coasting on a monopoly and that competition would drive them to improve, but I'm pretty confident that DDoS protection is one of those things where having a large network to absorb attacks, and a large team to monitor them, is very valuable. I submit as evidence that Cloudflare has been doing well despite the 3 big cloud providers offering DDoS protection.

So what would be the result of highly decentralized but slightly worse and less reliable DDoS protection? I'd argue that for a lot of things this wouldn't be an improvement. Cloudflare being so dominant means lots of things go down simultaneously, but that only matters for fungible services: if a school's education portal goes down, it doesn't matter whether all the other education portals are also down. There are cases where it matters, like the tyre pumps, though I'd argue those devices have no reason to be reliant on an online connection to begin with. I think cloud services as a whole have massively improved the reliability of internet services. In almost all cases, reducing the overall amount of outages is a higher priority than preventing outage correlations.

The problem is far more nuanced than the internet simply becoming too centralised.

I want to host my gas station network’s air machine infrastructure, and I only want people in the US to be able to access it. That simple task is literally impossible with what we have allowed the internet to become.

FWIW I love Cloudflare’s products and make use of a large amount of them, but I can’t advocate for using them in my professional job since we actually require distributed infrastructure that won’t fail globally in random ways we can’t control.

  • > and I only want people in the US to be able to access it. That simple task is literally impossible with what we have allowed the internet to become.

    Is anyone else as confused as I am about how common anti-openness and anti-freedom comments are becoming on HN? I don’t even understand what this comment wants: Banning VPNs? Walling off the rest of the world from US internet? Strict government identity and citizenship verification of people allowed to use the internet?

    It’s weird to see these comments get traction after growing up in an internet where tech comments were relentlessly pro freedom and openness on the web. Now it seems like every day I open HN there are calls to lock things down, shut down websites, and institute age (and therefore identity) verification requirements. It’s all so foreign, and it feels like the vibe shift happened overnight.

    • > Is anyone else as confused as I am about how common anti-openness and anti-freedom comments are becoming on HN?

      In this specific case I don't think it's about being anti-open. It's that a business with a physical presence in only one country, selling a service that is only physically accessible inside that country, doesn't have any need to sell compressed air to someone who isn't within 15 minutes of one of their gas stations.

      If we're being charitable to GP, that's my read at least.

      If it was a digital services company, sure. Meatspace in only one region though, is a different thing?

    • > It’s all so foreign and it feels like the vibe shift happened overnight.

      The cultural zeitgeist around the internet and technology has changed, unfortunately. But it definitely didn't happen overnight. I've been witnessing it happen slowly over the past 8-10 years, with it accelerating rapidly only in the last 5.

      I think it's a combination of special interest groups & nation states running propaganda campaigns, both with bots and real people, and a result of the internet "growing up." Once it became a global, high-stakes platform for finance and commerce, businesses took over, and businesses are historically risk averse. Freedom and openness is no longer a virtue but a liability (for them).

  • > I want to host my gas station network’s air machine infrastructure, and I only want people in the US to be able to access it. That simple task is literally impossible with what we have allowed the internet to become.

    That task was never simple and is unrelated to Cloudflare or AWS. The internet at a fundamental level only knows where the next hop is, not where the source or destination is. And even if it did, it would only know where the machine is, not where the person writing the code that runs on the machine is.

    • And that is a good thing and we should embrace it instead of giving in to some idiotic ideas from a non-technical C-suite demanding geofencing.

  • Genuine question - why are you spending time and effort on geofencing when you could spend it on improving your software/service?

    It takes time and effort for no gain on any sensible business goal. People outside the US won't need it, bad actors will spoof their location, and it might inconvenience your real customers.

    And if you want secure communication, just set up a zero-trust network.

    • > bad actors will spoof their location

      Isn't that exactly the point? Why are North Korean hackers even allowed to connect to the service, and why is spoofing location still so easy and unverifiable?

      Nobody is expected to personally secure their physical location against hostile state actors. My office is not artillery proof, nor does it need to be: hostile actions against it would be an act of war and we have the military to handle those kind of things. But with cybersecurity suddenly everyone is expected to handle everyone from the script kiddie next door to the Mossad. I see the point in OPs post: perhaps it would be good if locking down were a little easier than "just setup zero-trust network".

  • not a sysadmin here. why wouldn't this be behind a VPN or some kind of whitelist where only confirmed IPs from the offices / gas stations have access to the infrastructure?

    • In practice, many gas stations have VPNs to various services, typically via multiple VPN links for redundancy. There’s no reason why this couldn’t be yet another service going over a VPN.

      Gas stations didn’t stop selling gas during this outage. They have planned for a high degree of network availability for their core services. My guess is this particular station is an independent, or the air-pumping solution is not on anyone’s high-risk list.

  • Literally impossible? On the contrary: geofencing is easy. I block all kinds of nefarious countries on my firewall, and I don't miss them (no loss in not being able to connect to/from a mafia state like Russia). Now, if I were to block FAMAG... or Cloudflare...

    • Yes, literally impossible. The barrier to entry for anyone on the internet to create a proxy or VPN to bypass your geofencing is significantly lower than your cost to prevent them.

      6 replies →

    • It is definitely "literally impossible" if your acceptable false positive and false negative rates are zero.

      Having said that, vanishingly few companies/projects require that. For probably 99+% of websites, just using publicly available GeoIP databases to block countries will work just fine, so long as you don't pretend to yourself that North Korean or Chinese or Russian (or wherever) web users (or attackers) cannot easily get around that. And you'll also need to accept that occasionally a "local/wanted" user will end up with an IP address that gets blocked due to errors in the database.

      I worked on a project a decade or so back where we needed to identify which (Australian) state a website user was in, to correctly display total driveaway prices including all state taxes/charges (stamp duty, CTP insurance, and registration) for new cars. The MaxMind GeoIP database was not all that accurate at a state or city level, especially for mobile devices with CGNATed IP addresses. We ended up with known errors and estimates of error rates, a way for our Javascript to detect some of the known problems (like Vodafone's national CGNAT IP addresses), and a "We detected you're in NSW, and are displaying NSW pricing. Click here to change state." message where we could, and got legal signoff that we could claim "best effort" at complying with the driveaway price laws. 100% compliance with the laws as-written was "literally impossible" with zero error rates.
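
The "best effort" GeoIP approach described above can be sketched with a toy lookup table standing in for a real GeoIP database. The CIDR ranges below are documentation/example prefixes and the block policy is purely illustrative, not anyone's real allocation or rules:

```python
import ipaddress
from typing import Optional

# Toy CIDR-to-country table standing in for a real GeoIP database
# (these are reserved documentation prefixes, not real allocations).
GEOIP_TABLE = [
    (ipaddress.ip_network("203.0.113.0/24"), "AU"),
    (ipaddress.ip_network("198.51.100.0/24"), "US"),
]

BLOCKED_COUNTRIES = {"US"}  # hypothetical policy for illustration

def country_for(ip: str) -> Optional[str]:
    """Return the country code for an IP, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for network, country in GEOIP_TABLE:
        if addr in network:
            return country
    return None  # CGNAT, VPN exit nodes, and stale data all land here

def is_blocked(ip: str) -> bool:
    """Best-effort policy: unknown origins are allowed through."""
    return country_for(ip) in BLOCKED_COUNTRIES
```

Note that this inherits exactly the error rates the comment describes: an attacker behind a VPN in an allowed country sails through, and a legitimate user whose IP is mis-mapped in the database gets blocked.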

  • Client side SSL certificates with embedded user account identification are trivial, and work well for publicly exposed systems where IPsec or Dynamic frame sizes are problematic (corporate networks often mangle traffic.)

    Accordingly, connections from unauthorized users are effectively restricted, but the system is also not necessarily pigeonholed into a single point of failure.

    https://www.rabbitmq.com/docs/ssl

    Best of luck =3
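
As a generic sketch of the client-certificate idea (using Python's standard `ssl` module rather than RabbitMQ), a server-side TLS context can be configured to refuse any client that doesn't present a valid certificate. The CA-bundle path here is a placeholder, and a production server would also load its own keypair with `load_cert_chain`:

```python
import ssl
from typing import Optional

def make_mtls_context(client_ca: Optional[str] = None) -> ssl.SSLContext:
    """Server-side TLS context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # no client cert, no handshake
    if client_ca is not None:
        # CA bundle that signed your users' client certs (placeholder path).
        ctx.load_verify_locations(cafile=client_ca)
    return ctx
```

Because the check happens during the TLS handshake itself, unauthorized clients never reach application code at all, which is the property the comment is pointing at.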

  • Is Cloudflare having more outages than aws, gcp or azure? Honestly curious, I don't know the answer.

    • Definitely not.

      I was a bit shocked when my mother called me for IT help and sent me a screenshot of a Cloudflare error page with Cloudflare as the broken link, not the server. I assumed it was a bug in the error page and told her that the server was down.

  • I absolutely hate companies thinking they are being smart by blocking foreign IPs from using their websites.

    Every single time I want to order a burger from the local place, I have to use a VPN to fake being in the country (even though I actually am already physically here) so that it will let me give them my money.

    My phone's plan is not from here, so my IP address is actually not geographically in the same place as me.

I wonder what life without Cloudflare would look like. What practices would fill the gaps if a company didn't, or wasn't allowed to, satisfy the concerns that Cloudflare addresses?

  • Pretty much exactly like it does now but with less captchas.

    Fun fact: Headless browsers can easily pass cloudflare captchas automatically. They're not actually captchaing - they're just a placebo. You just need to be coming from a residential IP address and using a real browser.

    • > Pretty much exactly like it does now but with less captchas.

      This just isn't true. e.g. I saw a 30x increase in traffic on my forum due to AI bots that I had to use CF to block.

      CF is mainly empowered by the naive ideals of the internet's design that never built-in countermeasures against bad actors. You're expected to just deal with it yourself somehow. And that means outsourcing it, especially as residential IP address botnets on unlimited ISP data plans become cheaper and cheaper.

      Just ask yourself why web hosting providers themselves can't offer services at CF's level. It's because it's too hard of a problem even for them.

      2 replies →

I'll die on the hill that centralization is more efficient than decentralization and that rare outages of hugely centralized systems that are otherwise highly reliable are much better than full decentralization with much worse reliability.

In other words, when AWS or Cloudflare go down it's catastrophic in the sense that everyone sees the issues at the same time, but smaller providers usually have many more ongoing issues, ones that just happen to be "chronic" rather than "acute" pains.

  • Efficient in terms of what, exactly?

    There are multiple dimensions to this problem. Putting everything behind Cloudflare might give you better uptime, reliability, performance, etc. but it also has the effect of centralizing power into the hands of a single entity. Instead of twisting the arms of ten different CXOs, your local politician now only needs to twist the arm of a single CXO to knock your entire business off the internet.

    I live in India, where the government has always been hostile to the ideals of freedom of speech and expression. Complete internet blackouts are common in several states, and major ISPs block websites without due process or an appeals mechanism. Nobody is safe from this, not even Github[1]. In countries like India, decentralization is a preventative measure.

    [1] https://en.wikipedia.org/wiki/Censorship_of_GitHub#India

    And I'm not even going to talk about abuse of monopoly power and all that. What happens when Cloudflare has their Apple moment? When they jack up their prices 10x, or refuse to serve customers that might use their CDNs to serve "inappropriate" content? When the definition of "inappropriate" is left fuzzy, so that it applies to everything from CSAM to political commentary?

    No thanks.

  • >I'll die on hill that hyperoptimized systems are more efficient than anti-fragile.

    Of course they are; the issue is what level of failure we're going to accept.

  • And the irony is that people are pushing for decentralization like microservices and k8s - on centralized platforms like AWS.

Now just wait til every country on earth really does replace most of its employees with ChatGPT... and then OpenAI's data center goes offline with a fiber cut or something. All work everywhere stops. Cloudflare outage is nothing compared to that.

  • That was this outage. ChatGPT and Claude are both behind Cloudflare’s bot detection. You couldn’t log into either web frontend.

    And the error message said you were blocking them. We had support tickets coming in demanding to know why ChatGPT was being blocked.

    We also couldn’t log into our supplier’s B2B system to place our customer orders.

    So all the advice of “just self host” is moot when you’re in a food web.

  • > goes offline with a fiber cut

    If a fiber cut brings your network down then you have fundamental network design issues and need to change hiring practices.

For me personally, I didn't notice the downtime in the first hour or so. On some websites assets were not loading, but that's it. The Turnstile outage maybe impacted me most. Could be because I'm EU-based and Cloudflare is not as widespread here as in other parts of the world.

meta: why are we rewriting such anodyne titles? “was” -> “might be” undermines the author's point

The outage wasn’t a good thing, since nothing is changing as a result. (How many outages has Cloudflare had?)

I don't like this argument, since you could apply it to Google, Microsoft, AWS, Facebook, etc.

The tech world is dominated by US companies, and what are the alternatives to most of these services? There are a lot fewer than you might think, and even then you must make compromises in certain areas.

If these systems are as important as they say, it's surprising to me that they are not built with backups and redundancies in place like other mission critical things are engineered and built with.

feels like the main message is missed by seeing most of the discourse here:

> Outages like today's are a good thing because they're a warning. They can force redundancy and resilience into systems.

the advice is not to shun big companies and providers, but rather have a backup solution built-in for situations like this. switching solely to an in-house alternative is not always a great idea, but it can be a great backup solution.
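
One shape the "built-in backup solution" can take is a simple client-side fallback list: try the CDN-fronted endpoint first, then a self-hosted origin when the provider is down. A minimal sketch, where both URLs are hypothetical:

```python
import urllib.request

def fetch_with_fallback(urls, timeout=5):
    """Try each endpoint in order, returning the first successful response body."""
    last_error = None
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except OSError as err:   # URLError subclasses OSError
            last_error = err     # provider down; move on to the next endpoint
    if last_error is None:
        raise ValueError("no endpoints given")
    raise last_error

# Hypothetical endpoints: CDN-fronted primary, self-hosted backup.
ENDPOINTS = [
    "https://cdn.example.com/api/health",
    "https://origin.example.com/api/health",
]
```

This only covers the read path, of course; real failover also needs the backup origin to actually be kept warm and in sync, which is where the "not always a great idea" caveat above bites.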

Yeah, when it went down, a bunch of the sites I use every day just stopped working.

That’s when I realized it’s basically one of the backbone pieces of the entire internet.

It's a tragedy of the commons. Even if you don't use Cloudflare, does it matter if no one can pay for your products?

> They [outages] can force redundancy and resilience into systems.

They won’t until either the monetary pain of outages becomes greater than the inefficiency of holding on to more systems to support that redundancy, or, government steps in with clear regulation forcing their hand. And I’m not sure about the latter. So I’m not holding my breath about anything changing. It will continue to be a circus of doing everything on a shoestring because line must go up every quarter or a shareholder doesn’t keep their wings.

>It's ironic because the internet was actually designed for decentralisation, a system that governments could use to coordinate their response in the event of nuclear war

This is not true. The internet was never designed to withstand nuclear war.

  • Arpanet absolutely was designed to be a physically resilient network which could survive the loss of multiple physical switch locations.

  • ARPANET was literally invented during the cold war for the specific and explicit purpose of networked communications resilience for government and military in the event major networking hubs went offline due to one or more successful nuclear attacks against the United States

  • Perhaps. Perhaps not. But it will survive it. It will survive a complete nuclear winter. It's too useful to die, and will be one of the first things to be fixed after global annihilation.

    But the Internet is not hosting companies or cloud providers. The Internet doesn't care if they fail to build their systems resiliently and let the SPOFs creep in. The Internet does its thing and the packets keep flowing. Maybe BGP and DNS could use some additional armoring, but there are ways around both of them in case of actual emergency.

how many people are still on us-east-1

  • My old employer used Azure. It irritated me to no end when they said we must rename all our resources to match the convention of naming everything in US East with an "eu-" prefix (for Eastern United States, I guess).

    A total clown show

Outages like this highlight just how much of the internet’s resilience depends on a single provider. In a way, it’s a healthy reminder: if one company’s hiccup can take down half the web, maybe we’ve over‑centralized. A “good thing” only if it sparks more serious conversations about redundancy, multi‑provider strategies, and reducing monoculture risk. Otherwise, we’ll just keep repeating the same failure modes at larger scales.

Centralization has nothing to do with the problems of society and technology. And if you think the internet is all controlled by just a couple companies, you don't actually understand how it works. The internet is wildly decentralized. Even Cloudflare is. It offers tons of services, all of which are completely optional and can be used individually. You can also stop using them at any time, and use any of their competitors (of which there are many).

If, on the off chance, people just get "addicted" to Cloudflare, and Cloudflare's now-obviously-terrible engineering causes society to become less reliable, then people will respond to that. Either competitors will pop up, or people will depend on them less, or governments will (finally!) impose some regulations around the operation of technical infrastructure.

We have actually too much freedom on the Internet. Companies are free to build internet systems any way they want - including in very unreliable ways - because we impose no regulations or standards requirements on them. Those people are then free to sell products to real people based on this shoddy design, with no penalty for the products falling apart. So far we haven't had any gigantic disasters (Great Chicago Fire, Triangle Shirtwaist Factory Fire, MGM Grand Hotel Fire), but we have had major disruptions.

We already dealt with this problem in the rest of society. Buildings have building codes, fire codes, electrical codes. They prescribe and require testing procedures, provide standard building methods to ensure strength in extreme weather, resist a spreading fire long enough to allow people to escape, etc. All measures to ensure the safety and reliability of the things we interact with and depend on. You can build anything you want - say, a preschool? - but you aren't allowed to build it in a shoddy manner. We have that for physical infrastructure; now we need it for virtual infrastructure. A software building code.

  • Centralization means having a single point of failure for everything. If your government, mobile phone or car stops working, it doesn't mean all governments, all cars and all mobile phones stop working.

    Centralization makes mass surveillance easier, makes selectively denying of service easier. Centralization also means that once someone hacks into the system, he gains access to all data, not just a part of it.

i hate that i cannot just scrape things for my own usage and i have to use things like Camoufox instead of curl

The thing I learned from the incident is that Rust offers an unwrap function. It puzzles me why the hell they built such a function in the first place.

  • > It puzzles me why the hell they build such a function in the first place.

    One reason is similar to why most programming languages don't return an Option<T> when indexing into an array/vector/list/etc. There are always tradeoffs to make, especially when your strangeness budget is going to other things.
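
The tradeoff the parent describes shows up outside Rust, too. In Python, for instance, plain indexing is the "unwrap-style" path (crash on a miss), while `dict.get` is the Option-style path that forces the caller to decide what absence means. A small sketch with illustrative names:

```python
prices = {"regular": 3.50, "premium": 4.10}

# "unwrap"-style access: ergonomic, but a missing key crashes at runtime,
# which mirrors the failure mode the thread is discussing.
def price_or_crash(grade: str) -> float:
    return prices[grade]  # raises KeyError if grade is missing

# Option-style access: the caller must supply a policy for the missing case.
def price_or_default(grade: str, default: float = 0.0) -> float:
    return prices.get(grade, default)
```

Neither is universally right: crashing loudly is sometimes exactly what you want for "this should be impossible" states, while the defaulting version can silently paper over bugs.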