Cloudflare Global Network experiencing issues

4 hours ago (cloudflarestatus.com)

If anyone needs commands for turning off the CF proxy for their domains and happens to have a Cloudflare API token, here's how to do it.

First you can grab the zone ID via:

    curl -X GET "https://api.cloudflare.com/client/v4/zones" -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" | jq -r '.result[] | "\(.id) \(.name)"'

And a list of DNS records using:

    curl -X GET "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json"

Each DNS record has an associated ID. Finally, patch the relevant records:

    curl -X PATCH "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$RECORD_ID" -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" --data '{"proxied":false}'
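
Putting those together, here's a rough loop to turn off proxying on every currently proxied record in a zone. Just a sketch: it assumes jq is installed and $API_TOKEN/$ZONE_ID are set as above, and it ignores pagination, so very large zones may need extra handling:

    # find the IDs of all currently proxied records, then flip each one to proxied:false
    for RECORD_ID in $(curl -s "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
        -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" \
        | jq -r '.result[] | select(.proxied == true) | .id'); do
      curl -s -X PATCH "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$RECORD_ID" \
        -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" \
        --data '{"proxied":false}'
    done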

Copying from a sibling comment - some warnings:

- SSL/TLS: You will likely lose your Cloudflare-provided SSL certificate. Your site will only work if your origin server has its own valid certificate.

- Security & Performance: You will lose the performance benefits (caching, minification, global edge network) and security protections (DDoS mitigation, WAF) that Cloudflare provides.

- This will also reveal your backend/origin IP addresses. Permanent logs of the public IPs used by even obscure domain names are easy to find, so potential adversaries don't necessarily have to be paying attention at exactly the right time to discover them.

  • Also, for anyone who only has an old global API key lying around instead of the more recent tokens, you can set:

      -H "X-Auth-Email: $EMAIL_ADDRESS" -H "X-Auth-Key: $API_KEY"
    

    instead of the Bearer token header.
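
    So the zone-listing call above would just become the same request with the auth headers swapped, e.g. (a quick sketch):

        curl -X GET "https://api.cloudflare.com/client/v4/zones" -H "X-Auth-Email: $EMAIL_ADDRESS" -H "X-Auth-Key: $API_KEY" -H "Content-Type: application/json" | jq -r '.result[] | "\(.id) \(.name)"'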

    Edit: and in case you're like me and thought it would be clever to block all non-Cloudflare traffic hitting your origin... remember to disable that.

  • Awesome! I did it via the Terraform provider, but for anyone else without access to the dashboard this is great. Thank you!

  • This is exactly what we've decided we should do next time. Unfortunately we didn't generate an API token so we are sitting twiddling our thumbs.

    Edit: seems like we are back online!

    • Took me ~30 minutes but eventually I was able to log in, get past the 2FA screen and change a DNS record.

      I sure missed having a valid API token today.

    • I'm able to generate keys right now through WARP. Login takes forever but it is working.

  • If anyone needs the internet to work again (or to get into your cf dashboard to generate API keys), if you have Cloudflare WARP installed, turning it on appears to fix otherwise broken sites. Maybe using 1.1.1.1 does too, but flipping the radio box was faster. Some parts of sites are still down, even after tunneling into CF.

How did we get to a place where either Cloudflare or AWS having an outage means a large part of the web going down? This centralization is very worrying.

  • Because no one cares enough, including users.

    Oddly this centralization allows a complete deferral of blame without you even doing anything: if you’re down, that’s bad. But if you’re down, Spotify is down, social media is down… then “the internet is broken” and you don’t look so bad.

    It also reduces your incentive to change, if “the internet is down” people will put down their device and do something else. Even if your web site is up they’ll assume it isn’t.

    I’m not saying this is a good thing but I’m simply being realistic about why we ended up where we are.

    • As a user I do care, because I waste so much time on Cloudflare's "prove you are human" blocking page (why do I have to prove it over and over again?), and frequently run into websites blocking me entirely based on some bad IP blacklist used along with Cloudflare.

    • Users have no options because... everything has been centralized. So it doesn't matter if users care or not.

      Users are never a consideration today anyway.

    • > But if you’re down, Spotify is down, social media is down… then “the internet is broken” and you don’t look so bad.

      In my direct experience, this isn't true if you're running something even vaguely mission-critical for your customers. Your customer's workers just know that they can't do their job for the day, and your customer's management just knows that the solution they shepherded through their organization is failing.

    • Which "user" are you referring to? Cloudflare users or end product users?

      End product users have no power, they can complain to support and maybe get a free month of service, but the 0.1% of customers that do that aren't going to turn the tide and have anything change.

      Engineering teams using these services also get "covered" by them - they can finger point and say "everyone else was down too."

    • Many people care, but none of them can (sufficiently) change the underlying incentive structure to effect the necessary changes.

    • > It also reduces your incentive to change, if “the internet is down” people will put down their device and do something else. Even if your web site is up they’ll assume it isn’t.

      I agree. When people talk about the enshittification of the internet, Cloudflare plays a significant role.

    • This is essentially the entire IT excuse for going to anything cloud. I see IT engineers all the time justifying that the downtime stops being their problem and they stop being blamed for it. There's zero personal responsibility in trying to preserve service, because it isn't "their problem" anymore. Anyone who thinks the cloud makes service more reliable is absolutely kidding themselves, because everyone who made the decision to go that way already knows it isn't true; it just won't be their problem to fix.

      If anyone in the industry actually cared about reliability and took personal stake in their system being up, everyone would be back on-prem.

    • > Because no one cares enough, including users.

      When have users been asked about anything?

    • > Because no one cares enough, including users.

      this is like a bad motivational speaker talk.. heavy exhortations with a dramatic lack of actual reasoning.

      Systems are difficult, people. It is "incentives" of parties and lockin by tech design and vendors, not lack of individual effort.

    • More like "don't have a choice". It's not like customers are going to go to the competition, because before you can switch, the service will be back.

      Frankly it's a blessing, always being able to blame the cloud that management forced the company to migrate to in order to be "cheaper" (which half of the time turns out to be false anyway).

    • But Spotify was not down. One social media site was down.

      This:

      > if you’re down, that’s bad. But if you’re down, Spotify is down, social media is down… then “the internet is broken” and you don’t look so bad.

      is just marketing. If you are down with some other websites it is still bad.

    • Eh? It's because they are offering a service too good to refuse.

      The internet these days is fucking dangerous and murderous as hell. We need Cloudflare just to keep services up due to the deluge of AI data scrapers and other garbage.

  • Many reasons, but DDoS protection has massive network effects. The more customers you have (and therefore bandwidth provisioned), the easier it is to hold up against a DDoS, as a DDoS usually targets just one customer.

    So there are massive economies of scale. Small CDN with (say) 10,000 customers and 10mbit/sec per customer can handle 100gbit/s DDoS (way too simplistic, but hopefully you get the idea) - way too small.

    If you have the same traffic provisioned on average per customer and have 1 million customers, you can handle a DDoS 100x the size.

    Only way to compete with this is to massively overprovision bandwidth per customer (which is expensive, as those customers won't pay more just for you to have more redundancy because you are smaller).

    In a way (like many things in infrastructure) CDNs are natural monopolies. The bigger you get -> the more bandwidth and PoP you can have -> more attractive to more customers (this repeats over and over).

    It was probably very astute of Cloudflare to realise that offering such a generous free plan was a key step in this.

    • Your argument is technically flawed.

      In a CDN, customers consume bandwidth; they do not contribute it. If Cloudflare adds 1 million free customers, they do not magically acquire 1 million extra pipes to the internet backbone. They acquire 1 million new liabilities that require more infrastructure investment.

      All you are doing is echoing their pitch book. Of course they want to skim their share of the pie.

    • And how many companies want to also be able to build out their own CDN?

      Not every company can be an expert at everything.

      But perhaps many of us could buy a different CDN than the major players if we want to reduce the likelihood of mass outages like this though.

    • In my opinion, DDoS is only possible because there is no network protocol for a host to control traffic filtering on upstream providers (deny traffic from certain subnets or countries). If there were, everybody would prefer to write their own systems rather than rely on a harmful monopoly.

  • Yeah, I went to HN after the third web page didn't work. I am not just worried about the single point of failure, I am much more worried about this centralization eventually shaping the future standards of the web and making it de facto impossible to self-host anything.

    Well that and the fact that when 99% goes through a central party, then that central party will be very interesting for authoritarian governments to apply sweeping censorship rules to.

    • It is already nearly impossible/very expensive in my country to get a public IP address (even IPv6) which you could host on. The world is moving heavily towards being centrally dependent on these big cloud providers.

    • > eventually shaping the future standards of the web and making it de facto impossible to self-host anything

      Eventually?

  • This might sound crazy as a software engineer, but I actually like the occasional "snow day" where everything goes down. It's healthy for us to all disconnect from the internet for a bit. The centralization unintentionally helps facilitate that. At least, that's my glass half full perspective.

    • I can understand that sentiment. Just don't lose sight of the impact it can have on everyday people. My wife and I own a small theatre and we sell tickets through Eventbrite. It's not my full time job but it is hers. Eventbrite sent out an email this morning letting us know that they are impacted by the outage. Our event page appears to be working but I do wonder if it's impacting ticket sales for this weekend's shows.

      So while us in tech might like a "snow day", there are millions of small businesses and people trying to go about their day to day lives who get cut off because of someone else's fuck-ups when this happens.

    • If the internet was just social media, SaaS productivity suites, and AI slop, sure...

      But there are systems that depend on Cloudflare, directly or not, and when they go down it can have a serious impact on somebody's livelihood.

    • I'm guessing you're employed and your salary is guaranteed regardless. Would you have the same outlook if you were the self-employed founder of an online business and every minute of outage was costing you money?

  • Mostly since the AWS craze started a decade ago, developers have gone away from Dedicated servers (which are actually cheaper, go figure), which is causing all this mess.

    It's genuinely insane that many companies design a great number of fallbacks at the software level but give almost no thought to the hardware/infrastructure level; common sense dictates that you should never host everything on a single provider.

    • I tried as hard as I could to stay self hosted (and my backend is, still), but getting constant DDoS attacks and not having the time to deal with fighting them 2-3x a month was what ultimately forced me to Cloudflare. It's still worse than before even with their layers of protection, and now I get to watch my site be down a while, with no ability to switch DNS to point back to my own proxy layer, since CF is down :/

    • With the state of constant attack from AI scrapers and DDOS bots, you pretty much need to have a CDN from someone now, if you have a serious business service. The poor guys with single prem boxes with static HTML can /maybe/ weather some of this storm alone but not everything.

    • > developers have gone away from Dedicated servers (which are actually cheaper, go figure)

      It depends on how you calculate your cost. If you only include the physical infrastructure, having a dedicated server is cheaper. But by having a dedicated server you lose a lot of flexibility. Need more resources? Just scale up your EC2, whereas with a dedicated server there is a lot more work involved.

      Do you want a 'production-ready' database? With AWS you can just click a few buttons and have an RDS instance ready to use. To roll out your own PG installation you need someone with a lot of knowledge (how to configure replication? backups? updates? ...).

      So if you include salaries in the calculation, the result changes a lot. And even if you already have some experts on your payroll, by putting them to work deploying a PG instance you won't be able to use them to build other things that may generate more value for your business than the premium you pay to AWS.
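
      For a feel of what "roll your own" involves, here is the bare-bones shape of PostgreSQL streaming replication (a sketch only: the addresses and the replicator role are made up, and it leaves out backups, failover, monitoring and tuning entirely):

          # on the primary: postgresql.conf needs wal_level = replica and max_wal_senders > 0,
          # plus a pg_hba.conf entry allowing the replica to connect, e.g.
          #   host replication replicator 10.0.0.2/32 scram-sha-256

          # on the replica: clone the primary and start it in standby mode
          pg_basebackup -h 10.0.0.1 -U replicator -D /var/lib/postgresql/data -R -X stream -P
          pg_ctl -D /var/lib/postgresql/data start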

    • I like the idea of having my own rack in a data center somewhere (or sharing the rack, whatever) but even a tiny cost is still more than free. And even then, that data center will also have outages, with none of the benefits of a Cloudflare Pages, GitHub Pages, etc.

    • I self hosted on one of the company’s servers back in the late 90s. Hard drive crashes (and a hack once, through an Apache bug) had our services (http, pop, smtp, nfs, smb, etc ) down for at least 2-3 days (full reinstall, reconfiguration, etc).

      Then, with regular VPSs I also had systems down for 1-2 days. Just last week the company that hosts NextCloud for us was down the whole weekend (from Friday evening) and we couldn’t get their attention until Monday.

      So far these huge outages that last 2-5 hours are still lower impact for me, and require me to take less action.

    • Cloud hosters are that hardware fallback. They started out offering better redundancy and scaling than your homemade breadbox. But it seems they lost something along the way, and now we have this.

    • Maintenance cost is the main issue for on-prem infra; nowadays add things like DDoS protection and/or scraping protection, which can require a dedicated team or force your company to rely on some library or open source project that is not guaranteed to be maintained forever (unless you support them, which I believe in)... Yeah, I can understand why companies shift off of on-prem nowadays.

    • ... dedis are cheaper if you are rightsized. If you are wrongsized they just plain crash, and you may or may not be able to afford the upgrade.

      I was at Softlayer before I was at AWS, and what catalyzed the move was the time I needed to add another hard drive to a system and somehow they screwed it up. I couldn't put in a trouble ticket to get it fixed because my database record in their trouble ticket system was corrupted. The next day I moved my stuff to AWS, and the day after that they had a top sales guy talk to me to try to get me to stay, but it was too late.

  • They're using Cloudflare for multicloud, but still have Cloudflare as a single point of failure. Should make a Cloudflare for Cloudflare to solve this.

  • Now that network effects and data lock-in have taken root, downtime is not as big of a concern as it was in the 2000s

    • Except, y'know, where people's lives and livelihoods depend on access to information or being able to do things at an exact time. AWS and Cloudflare are disqualifying themselves from hospitals and the military and whatnot.

  • Because it's better to have a really convenient and cheap service that works 99% of the time than a resilient one that is more expensive or more cumbersome to use.

    It's like github vs whatever else you can do with git that is truly decentralized. The centralization has such massive benefits that I'm very happy to pay the price of "when it's down I can't work".

  • How did we get to a place where Cloudflare being down means we see an outage page, but on that page it tells us explicitly that the host we're trying to connect to is up, and it's just a Cloudflare problem.

    If it can tell us that the host is up, surely it can just bypass itself to route traffic.

  • This was always the case. There was always a "us-east" in some capacity, under Equinix, etc. Except it used to be the only "zone," which is why the internet is still so brittle despite having multiple zones. People need to build out support for different zones. Old habits die hard, I guess.

  • I would be less worried if Cloudflare and AWS weren't involved in many more things than simply running DNS.

    AWS - someone touches DynamoDB and it kills the DNS.

    Cloudflare - someone touches functionality completely unrelated to DNS hosting and proxying and, naturally, it kills the DNS.

    There is this critical infrastructure that just becomes one small part of a wider product offering, worked on by many hands, and this critical infrastructure gets taken down by what is essentially a side-effect.

    It's a strong argument to move to providers that just do one thing and do it well.

  • Most developers don't care to know how the underlying infrastructure works (or why) and so they take whatever the public consensus is re: infra as a statement of fact (for the better part of the last 15 years or so that was "just use the cloud"). A shocking amount of technical decisions are socially, not technically enforced.

  • We take the idea of the internet always being on for granted. Most people don’t understand the stack and assume that when sites go down it’s isolated, and although I agree with you, it’s just as much complacency and lack of oversight and enforcement delays in bureaucracy as it is centralization. But I guess that’s kind of the umbrella to those things… lol

  • This topic is raised every time there is an outage with Cloudflare, and the truth of the matter is, they offer an incredible service and there is no competitor big enough to challenge it. By definition their services are so good BECAUSE their adoption rate is so high.

    It's very frustrating of course, and it's the nature of the beast.

  • IMO, centralization is inevitable because the fundamental forces drive things in that direction. Clouds are useful for a variety of reasons (technical, time to market, economic), so developers want to use them. But clouds are expensive to build and operate, so there are only a few organizations with the budget and competency to do it well. So, as the market matures you end up with 3 to 5 major cloud operators per region, with another handful of smaller specialists. And that’s just the way it works. Fighting against that is to completely swim upstream with every market force in opposition.

  • > How did we get to a place where either Cloudflare or AWS having an outage means a large part of the web going down?

    As always, in the name of "security". When are we going to learn that anything done, either by the government or by a corporation, in the name of security is always bad for the average person?

  • Well the centralisation without rapid recovery and practices that provide substantial resiliency… that would be worrying.

    But I dare say the folks at these organisations take these matters incredibly seriously and the centralisation problem is largely one of risk efficiency.

    I think there is no excuse, however, not to have multi-region state and pilot-light architectures, just in case.

  • Compliance. If you wanna sell your SAAS to big corpo, their compliance teams will feel you know what you're doing if they read AWS or Cloudflare on your architecture, even if you do not quite know what you're doing.

  • Currently at the public library and I can't use the customer inventory terminals to search for books. They're just a web browser interface to the public facing website, and it's hosted behind CF. Bananas.

  • Don't forget the CrowdStrike outage: one company had a bug that brought down almost everything. Who would have thought there are so many single points of failure across the entire Internet.

  • It's because single points of traffic concentration are the most surveillable architecture, so FVEY et al economically reward with one hand those companies who would build the architecture they want to surveil with the other hand.

  • Except businesses love it.

    A lot (and I mean a lot) of people in IT like centralization specifically because it’s hard to blame people for doing something that everyone else is doing.

    • And HN users love it too. I've had people on this site say how great it is that their system routes 30% of traffic on the internet.

      I'd be horrified. That's not the internet or computing industries I grew up with, or started working in.

      But as long as the SPY keeps hitting > 10% returns each year, everyone's happy.

  • I don't think there is anything wrong with a centralised service being down; you just make a conscious decision whether you want that and can afford it.

    People not being ready for cloudflare/[insert hyperscaler] to be possibly down is the only fault.

  • The technical term for it is a man in the middle. It’s better to call it what it is that way you aren’t fooled into thinking it’s not, because it is.

  • And all of these outages happening not long after most of them dismissed a large amount of experienced staff while moving jobs offshore to save in labor costs.

  • Re: Cloudflare it is because developers actively pushed "just use Cloudflare" again and again and again.

    It has been dead to me since the SSL cache vulnerability thing and the arrogance with which senior people expected others to solve their problems.

    But consider how many people still do stupid things like use the default CDN offered by some third party library, or use google fonts directly; people are lazy and don't care.

  • because efficiency trumps redundancy in the short term, which is all that matters in a super competitive environment.

  • Is avoiding single point of failure in anyone’s playbook? ¯\_(ツ)_/¯

    • We only care about it when it's time to complain about the work of individual people.

      Companies can always do as they please and people will rationalize anything.

  • 5 mins. of thought to figure out why these services exist?

    Dialogue about mitigations/solutions? Alternative services? High availability strategies?

    Nah! It's free to complain.

    Me personally, I'd say those companies do a phenomenal job by being a de facto backbone of the modern web. Also Cloudflare, in particular, gives me a lot of things for free.

  • because Cloudflare protection blah blah, until Cloudflare is down itself and then you are back to "who watches the watchmen"

  • Hacking software or hardware is so old school.

    The target these days is the user.

    The make-believe worm.

A colleague of mine just came bursting through my office door in a panic, thinking he brought our site down since this happened just as he made some changes to our Cloudflare config. He was pretty relieved to see this post.

  • Tell him it's worse than he thinks. He obviously brought the entire Cloudflare system down.

    • You joke and I think its funny, but as a junior engineer I would be quite proud if some small change I made was able to take down the mighty Cloudflare.

  • Well, you can never be sure that he didn't:

    https://www.fastly.com/blog/summary-of-june-8-outage

    • It's also what caused the Azure Front Door global outage two weeks ago - https://aka.ms/air/YKYN-BWZ

      "A specific sequence of customer configuration changes, performed across two different control plane build versions, resulted in incompatible customer configuration metadata being generated. These customer configuration changes themselves were valid and non-malicious – however they produced metadata that, when deployed to edge site servers, exposed a latent bug in the data plane. This incompatibility triggered a crash during asynchronous processing within the data plane service. This defect escaped detection due to a gap in our pre-production validation, since not all features are validated across different control plane build versions."

    • > May 12, we began a software deployment that introduced a bug that could be triggered by a specific customer configuration under specific circumstances.

      I'd love to know more about what those specific circumstances were!

    • I'm pretty sure I crashed Gmail using something weird in its filters. It was a few years ago. Every time I did something specific (I don't remember what), it would freeze and then display a 502 error for a while.

  • Is there a word for that feeling of relief when someone else fucked up after initially thinking it was you?

    • What's funny is as I get older this feeling of relief turns more into a feeling of dread. The nice thing about problems that you cause is that you have considerable autonomy to fix them. When Cloudflare goes down you're sitting and waiting for a 3rd party to fix something.

    • The problem is, I still get the wrong end of the stick when AWS or CF go down! Management doesn't care, understandably. They just want the money to keep coming in. It's hard to convince them that this is a pretty big problem. The only thing that will calm them down a bit is to tell them Twitter is also down. If that doesn't get them, I say ChatGPT is also down. Now NOBODY will get any work done! lol.

    • When I'm debugging something, I'm not usually looking for the solution to the problem; I'm looking for sufficient evidence that I didn't cause the problem. Once I have that, the velocity at which I work slows down

    • Maybe this isn’t great, but I get a hint of that feeling when I’m on an airplane and hear a baby crying. For a number of years, if I heard a baby crying, it was probably my baby and I had to deal with it. But now my kids are past that phase, so when I hear the crying, after that initial jolt of panic I realize that it isn’t my problem, and that does give me the warm fuzzies. Even though I do feel bad for the baby and their parents.

  • I woke up getting bombarded by messages from multiple clients about sites not working. I shit my pants because I'd changed the config just yesterday. When I saw the status message "cloudflare down" I was so relieved.

  • Good that he worked it out so quick. I recently spent a day debugging email problems on Railway PaaS, because they silently closed an SMTP port without telling anyone.

  • You missed a great opportunity to dead-pan him with something like "No, Bob, not just our site, you brought down the entire Internet, look at this post!"

  • Chances are still good that somewhere within Cloudflare someone really did do a global configuration push that brought down the internet.

    When aliens study humans from this period, their book of fairy tales will include several where a terrible evil was triggered by a config push.

  • Wait for the post mortem ... It is a technical possibility, race condition propagates one customer config to all nodes... :-)

Pretty much everything is down (checking from the Netherlands). The Cloudflare dashboard itself is experiencing an outage as well.

Not-so-funny thing is that the Betterstack dashboard is down but our status page hosted by Betterstack is up, and we can't access the dashboard to create an incident and let our customers know what's going on.

Edit: wording.

  • Yep that's also my experience. Except HN because it does not use *** Cloudflare because it knows it is not necessary. I just wrote a blog titled "Do Not Put Your Site Behind Cloudflare if You Don't Need To" [1].

    [1]: https://huijzer.xyz/posts/123/

    • Sadly, AI bots and crawlers have made CF the only affordable way to actually keep my sites up without incurring excessive image serving costs.

      Those TikTok AI crawlers were destroying some of my sites.

      Millions of images served to ByteSpider bots, over and over again. They wouldn't stop. It was relentless abuse. :-(

      Now I've just blocked them all with CF.

    • Yes, I never understand this obsession with centralized services like Cloudflare. To be fair though, if our tiny blogs only have a hundred or so visitors monthly anyway, does it matter if they have an outage for a day?

    • Last time I tried this I got DDoS'd so I don't see a reason to step away from CF. That said, this is the price I pay

    • ~~two~~ three comments on that:

      1. DDOS protection is not the only thing anymore, I use cloudflare because of vast amounts of AI bots from thousands of ASNs around the world crawling my CI servers (bloated Java VMs on very undersized hosts) and bringing them down (granted, I threw cloudflare onto my static sites as well which was not really necessary, I just liked their analytics UX)

      2. the XKCD comic is mis-interpreted there, that little block is small because it's a "small open source project run by one person", cloudflare is the opposite of that

      3. edit: also cloudflare is awesome if you are migrating hosts, did a migration this past month, you point cloudflare to the new servers and it's instant DNS propagation (since you didnt propagate anything :) )

  • It’s that time of the year again where we all realize that relying on AWS and Cloudflare to this degree is pretty dangerous but then again it’s difficult to switch at this point.

    If there is a slight positive note to all this, then it is that these outages are so large that customers usually seem to be quite understanding.

    • Unless you’re say at airport trying to file a luggage claim … or at the pharmacy trying to get your prescription. I think as a community we have a responsibility to do better than this.

    • > If there is a slight positive note to all this, then it is that these outages are so large that customers usually seem to be quite understanding.

      Which only shows that chasing five 9s is worthless for almost all web products. The idea is that by relying on AWS or Cloudflare you can push your uptime numbers up to that standard, but these companies are themselves having such frequent outages that customers don't expect that kind of reliability from web products.

    • If I choose AWS/cloudflare and we're down with half of the internet, then I don't even need to explain it to my boss' bosses, because there will be an article in the mainstream media.

      If I choose something else, we're down, and our competitors aren't, then my overlords will start asking a lot of questions.

    • Happy to hear anyone's suggestions about where else to go or what else to do in regards to protecting from large-scale volumetric DDoS attacks. Pretty much every CDN provider nowadays has stacked up enough capacity to tank these kinds of attacks; good luck trying to combat them yourself these days.

  • Cloudflare dashboard is down-ish, not totally down. If you're persistent you can turn off the turnstile and proxy.

    It took a few minutes but I got https://hcker.news off of it.

    • I can't sign in since Turnstile is down so I can't complete the captcha to log in.

      I also can't log in via Google SSO since Cloudflare's SSO service is down.

    • I'm already logged in on the cloudflare dashboard and trying to disable the CF proxy, but getting "404 | Either this page does not exist, or you do not have permission to access it" when trying to access the DNS configuration page.

    • Not saying not to do this to get through, but just as an observation, it’s also the sort of thing that can make these issues a nightmare to remediate, since the outage can actually draw more traffic just as things are warming up, from customers desperate to get through.

      But then, that’s what Cloudflare signed up to be.

  • Could always just use a status page that updates itself. For my side project Total Real Returns [1], if you scroll down and look at the page footer, I have a live status/uptime widget [2] (just an <img> tag, no JS) which links to an externally-hosted status page [3]. Obviously not critical for a side project, but kind of neat, and was fun to build. :)

    [1] https://totalrealreturns.com/

    [2] https://status.heyoncall.com/svg/uptime/zCFGfCmjJN6XBX0pACYY...

    [3] https://status.heyoncall.com/o/zCFGfCmjJN6XBX0pACYY

    • This is unrelated to the cloudflare incident but thanks a lot for making that page. I keep checking it from time to time and it's basically the main data source for my long term investing.

  • I think there is a big business opportunity here. Make a site that lets companies put their status updates on a local VPS for $100.

  • Same here. We’re using OhDear. The status page is available but I can’t post an incident because their service is also behind Cloudflare.

    • Co-founder here, we'll be working on better ways to handle this over the coming days.

      Update: our app is available again without Cloudflare, you'll be able to post updates to status pages smoothly again.

  • BetterStack did report issues with some of their services, but they were not very informative.

  • All my stuff is working. Things on GCP. Things on Fly.io. Tooling I use.

    "Only" 10% of the internet is behind Cloudflare so far ;)

    • Happy for you :)

      I am curious about these two things:

      1- Has GCP also had any recent outages similar to AWS, Azure or CF? If a similar-sized (14 TB?) DDoS were to hit GCP, would it stand or would it fail?

      2- If this DDoS was targeting Fly.io, would it stand? :)

  • When it's back up, do yourself a favour and rent a $5/mo VPS in another country from a provider like OVH or Hetzner and stick your status page on that.

    "Yes but what if they go down" - it doesnt matter, having it hosted by someone who can be down for the same reason as your main product/service is a recipe for disaster.

  • Seems like workers are less affected and maybe betterstack has decided to bypass cloudflare "stuff" for the status pages? (maybe to cut down costs). My site is still up though some GitHub runners did show it failed at certain points.

  • I don't get why you need such a service for a status page with 99.whatever% uptime. I mean, your status page only has to be up if everything else is down, so maybe 1% uptime is fine.

    /s

Classic. I see issues. Vendor’s status page is all green. Go to HN to find the confirmation. Applies to AWS, GH, everyone.

Edit: beautiful, this decentralised design of the internet.

  • I get the feeling that all "serious" businesses have manual processes for publicly facing status pages, for political reasons.

    I don't like it.

    • I’ve written before on HN about when my employer hired several ex-FAANG people to manage all things cloud in our company.

      Whenever there was an outage they would put up a fight against anyone wanting to update the status page to show the outage. They had so many excuses and reasons not to.

      Eventually we figured out that they were planning to use the uptime figures for requesting raises and promos as they did at their FAANG employer, so anything that reduced that uptime number was to be avoided at all costs.

    • It's because if you automate it, something could/would happen to the little script that defines "uptime," so if it goes down, suddenly you're in violation of your SLA and all of your customers start demanding refunds/credits/etc. when everything is running fine.

      Plus, if you are allowed 45 minutes of downtime per year and it takes you an hour to manually update the status page, you just bought yourself an extra hour to figure out how to fix the problem before you have to start issuing refunds/credits

  • I usually get notifications from the sales/CS team way before the status page/incident list has any blip. This time was not an exception

There's something maliciously satisfying about seeing your own self-hosted stuff working while things behind Cloudflare or AWS are broken. Sure, they have like four more nines than me, but right now I'm sitting pretty.

  • This is a real problem for some "old-school enterprise" companies that use Oracle, SAP, etc. along with the new AWS/CF based services. They are all waiting around for new apps to come back up while their Oracle suite/SAP are still functioning. There is a lesson here for some of these new companies selling to old-school companies.

  • I was just able to save a proxied site. Then the dashboard went down again. I didn't even know it was still on. It's really not doing anything for performance because the traffic is quite low.

  • How do you deal with DNS? I'm hosting something on a Raspberry Pi at home, and I had recently moved the DNS to Cloudflare. It's quite funny seeing my small personal website being down, although quite satisfying seeing both the browser and host with a green tick while Cloudflare is down.

    • > How do you deal with DNS?

      DNS is actually one of the easiest services to self-host, and it's fairly tolerant of downtime due to caching. If you want redundancy/geographical distribution, Hurricane Electric has a free secondary/slave DNS service [0] where they'll automatically mirror your primary/master DNS server.

      [0]: https://dns.he.net/
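
      If you do use HE's secondary service, a quick sanity check that their servers have picked up your zone is comparing SOA serials between your primary and one of theirs (names here are hypothetical; use whichever HE nameserver the panel lists for your zone):

          dig +short example.com SOA @ns1.example.com
          dig +short example.com SOA @ns2.he.net    # serials should match once the zone transfer has happened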

    • I don't have experience with a dynDNS setup like you describe, hosting from (probably) home. But my domains are on a VPS (and a few other places here and there) and DNS is done via my domain reseller's DNS settings pages.

      Never had an issue hosting my stuff, but as said - I don't yet have experience hosting something from home with a more dynamic DNS setup.

  • just a couple of days ago, I've moved my self hosted stuff from Cloudflare :)

Is it me or has there been a very noticeable uptick in large scale infra-level outages lately? AWS, Cloudflare, etc have all been way under whatever SLA they publish.

  • Coincidentally, large tech companies have been conducting mass layoffs and claim they're going to rely on AI much more to replace junior developers.

    • That does seem to be a coincidence, as the recent outages making headlines (including this one according to early reports) have been associated with huge traffic spikes. It seems DDoS are reaching a new level.

  • For me the only silver lining to all these cloud outages is that now we know their published SLA times mean absolutely nothing. The number of 9's used to at least give an indication of intent of reliability; now they are twisted to whatever metric the company wants to represent and don't actually represent guaranteed uptime anywhere.

  • Some of the other commenters here have posited a "vibe code theory". As the amount of vibe code in production increases, so does the number of bugs and, therefore, the number of outages.

    • None of the recent major outages were traced down to "vibe coding" or anything of the sort. They appear to be the kind of misconfigurations and networking fuckups that existed since Internet became more complex than 3 routers.

    • Speaking of "vibe-coding", I wonder how much their own outage is affecting their ability to vibe-code their way out of it.. :-)

      The openai login page says:

          Please unblock challenges.cloudflare.com to proceed.

    • > Some of the other commenters here have posited a "vibe code theory". As the amount of vibe code in production increases, so does the number of bugs and, therefore, the number of outages.

      Likely this coupled with the mass brain damage caused by never-ending COVID re-infections.

      Since vaccines don't prevent transmission, and each re-infection increases the chances of long COVID complications, the only real protection right now is wearing a proper respirator everywhere you go, and basically nobody is doing that anymore.

  • The theory I’ve heard is holiday deploy freezes coupled with Q4 goals creates pressure to get things in quickly and early. It’s all been in the last month or so which does line up.

  • My theory is a state-sponsored actor targeting some of these services, but maybe that's just too 'tinfoil hat' of me, who knows.

    • There are usually very comprehensive post mortems for these events, and none have suggested that at all

    • This only amplifies the often-repeated propaganda about the "very powerful" enemies of democracy, who in fact are very fragile dictatorships. There's enough incompetence at tech companies to f up their own stuff.

  • If it's any guidance, US cyber risk insurance (which covers among other things disruptions due to supplier outages) has continuously dropped in price since Q1 2023, with a handful of percent per year.

    If you excuse the sloppy plot manually transcribed from market index data: https://i.xkqr.org/cyberinsurancecost.png

  • I suspect the number of outages is the same, but the number of sites putting all of their eggs into these two baskets has grown considerably.

Ironically, DownDetector seems to be down because it protects its site with Cloudflare Turnstile... which is also down!

  • The report there for AWS also skyrocketed, but I guess it's probably false positives?

      Even many non-tech people have begun to associate Internet-wide outages with "AWS must be down", so I imagine many of them are searching "is aws down", and for Down Detector a hit counts as a down report, so it will report AWS impacts even when the culprit is Cloudflare, as in this case.

I do appreciate the visual "mea culpa":

Your browser: Working

Host: Working

Cloudflare: Error

  • Might be the first time I have ever seen that. Though in my case the "Host" is Cloudflare's own Pages service.

    • Yeah, I was shocked. Disbelief that the host was up, since the host being down is what usually happens when Cloudflare's page shows up.

  • They still blame the customers when you click on "Cloudflare":

    > If the problem isn’t resolved in the next few minutes, it’s most likely an issue with the web server you were trying to reach.

    • In terms of probability looking at the history, it is correct. It's mostly me messing up with the web server.

  • I noticed that refreshing honesty too, not that the users did (our wifi is down fix it pls urgent)

  • That is really good to be honest!

    I have Cloudflare running in production and it is affecting us right now. But at least I know what is going on and how I can mitigate (e.g. disable Cloudflare as a proxy if it keeps affecting our services at skeeled).

  • I searched my logs for errors for about an hour before figuring out the problem was not on my server :D

  • That page has special if/endif HTML comments to handle if your browser is IE 6, IE 7, IE 8...

I’d rather mitigate a DDoS attack on my own servers than deal with Cloudflare. Having to prove you’re human is the second-worst thing on my list, right after accepting cookies. Those two things alone have made browsing the web a worse experience than it was in the late 90s or early 2000s.

  • There's worse than having to prove (over and over and over again) that you are human: having your IP just completely blocked by Cloudflare zealous bot-filtering (and I use a plain mass market ISP in a developed country and not some shady network)

  • As much as this situation sucks, how do you plan to "mitigate a DDoS attack on my own servers"? The reason I use Cloudflare is to use it as a proxy, especially for DDoS attacks if they do occur. Right now, our services are down and we are getting tons of customer support tickets (like everyone else), but it is a lot easier to explain that the whole world is down vs. just us.

  • How do you plan on mitigating a DDoS on your own servers?

    • Worrying about a DDoS on your tiny setup is like a brand-new dev stressing over how they'll handle a billion requests per second...cute, but not exactly a real-world problem for 99.99% of you. It's one of those internet boogeyman myths people love to panic about.

    • Alright kids, breathe...a DDoS attack isn't the end of the world, it's just the internet throwing a tantrum. If you really don't want to use a fancy protection provider, you can still act like a grown-up: get your datacenter to filter trash at the edge, announce a more specific prefix with BGP so you can shift traffic, drop junk with strict ACLs, and turn on basic rate limiting so bots get bored. You can also tune your kernel so it doesn't faint at SYN storms, and if the firehose gets too big, pop out a more specific BGP prefix from a backup path or secondary router so you can pull production away from the burning IP.
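
      For the "tune your kernel / rate limit" part, a minimal Linux sketch (assumes iptables and a reasonably recent kernel; the numbers are illustrative, not tuned for any real workload):

          # absorb SYN floods a little more gracefully
          sysctl -w net.ipv4.tcp_syncookies=1
          sysctl -w net.ipv4.tcp_max_syn_backlog=8192
          sysctl -w net.core.somaxconn=4096

          # drop new HTTPS connections from any single source above a per-IP rate
          iptables -A INPUT -p tcp --syn --dport 443 -m hashlimit \
              --hashlimit-above 50/second --hashlimit-burst 200 \
              --hashlimit-mode srcip --hashlimit-name https-syn -j DROP

      None of this saves you from a truly volumetric attack that saturates your uplink, which is where the BGP tricks (or a CDN) come in.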

Interestingly, I'm also noticing that websites that use Cloudflare Challenge (aka "I'm not a Robot") are throwing exceptions with a message of "Please unblock challenges.cloudflare.com to proceed" - even though it's just responding with an HTTP 500.

  • The state of error handling in general is woeful, they do anything to avoid admitting they're at fault so the negative screenshots don't end up on social media.

    Blame the user or just leave them at an infinite spinning circle of death.

    I check the network tab and find the backend is actually returning a reasonable error but the frontend just hides it.

    Most recent one was a form saying my email was already in use, when the actual backend error returned was that the password was too long.

  • This takes down AI/search on chat.bing.com (GPT5, unauthenticated).

    Funny, since I would have to prove to an AI that I am human in the first place.

  • I think the site (front-end) thinks you have blocked the domain through DNS or an extension; and thus suggests you unblock it. It is unthinkable that Cloudflare captchas could go down /s.

Quote from The Guardian's story:

>A spokesperson for Cloudflare said: “We saw a spike in unusual traffic to one of Cloudflare’s services beginning at 11.20am. That caused some traffic passing through Cloudflare’s network to experience errors. While most traffic for most services continued to flow as normal, there were elevated errors across multiple Cloudflare services.

>“We do not yet know the cause of the spike in unusual traffic. We are all hands on deck to make sure all traffic is served without errors. After that, we will turn our attention to investigating the cause of the unusual spike in traffic.”

https://www.theguardian.com/technology/2025/nov/18/cloudflar...

  • Sounds like it may have been a cyber attack...

    • "Unusual spike of traffic" can just be errant misconfiguration that causes traffic spikes just from TCP retries or the like. Jumping to "cyber attack" is eating up Hollywood drama.

      In most cases, it's just cloud services eating shit from a bug.

> During our attempts to remediate, we have disabled WARP [their VPN service] access in London. Users in London trying to access the Internet via WARP will see a failure to connect. Posted 4 minutes ago. Nov 18, 2025 - 13:04 UTC

Is Cloudflare being attacked...?

  • This line also gave me that vibe

      > We have made changes that have allowed Cloudflare Access [their 'zero-trust network access solution'] and WARP to recover. Error levels for Access and WARP users have returned to pre-incident rates.

      > We have re-enabled WARP access in London.

      > We are continuing to work towards restoring other services.

      > Posted 12 minutes ago. Nov 18, 2025 - 13:13 UTC

      Now I'm really suspicious that they were attacked...

I didn’t see anyone comment this directly, but something these recent outages made me wonder, having spent a good chunk of my career in 24/7 tech support, is that I can’t even fathom the amount of people who have been:

- restarting their routers and computers instead of taking their morning shower, getting their morning coffee, taking their medication on time because they're freaking out, etc.

- calling ISPs in a furious mood not knowing it's a service in the stack and not the provider's fault (maybe)

- being late for work in general

- getting into arguments with friends and family and coworkers about politics and economics

- being interrupted making their jerk chicken

One of the things that I didn't like about Cloudflare's MITM-as-a-service is their requirement that if you want SSL/CDN you must use their DNS. Overconcentration of infra within one single point of disruption, with no easy outs when the stack tips over. Sadly I don't see any changes or rethink to be more decentralised even after this outage.

  • to be clear, that's just a limitation on their free service. If you pay, you can keep your own DNS

    • Yeah, they keep reinforcing bad vendor lock-in practices. I'd guess the number of free users surpasses the paying ones, and situations like these leave them all unable to recover.

I know this is bad, and some people's livelihood and lives rely on critical infrastructure, but when these things happen, I sometimes think GOOD!, let's all just take a breather for a minute yeh? Go outside.

I went to check how many services are being impacted on down detector, but it was down.

Interesting (unnerving?) to see a number of domain registrars that offer their own DNS services utilize at least some kind of Cloudflare service for their own web fronts. Did a check on 6 registrar sites I currently interact with: half were down (Namecheap/Spaceship, Name, Dynadot) and half were up (Porkbun, Gandi, GoDaddy).

  • I just considered moving from Namecheap to Porkbun as Namecheap is down, but Porkbun uses Cloudflare for their CAPTCHA, meaning I'm unable to sign up (and I assume log in as well), so also no good.

Your origin servers are protected now as no one can access them. Thanks for choosing CloudFlare's MITM "protection".

Tried checking Cloudflare’s status on Downdetector, but Downdetector was also behind Cloudflare. Internet checkmate.

  • It’s not just websites :-/

    Things like Apple Private Relay (which way too many people seem to have enabled) are tunnelled via Cloudflare, maybe using WARP?

What would the Internet's architecture have to look like for DDOS'ing to be a thing of the past, and therefore Cloudflare to not be needed?

I know there are solutions like IPFS out there for doing distributed/decentralised static content distribution, but that seems like only part of the problem. There are obviously more types of operation that occur via the network -- e.g. transactions with single remote pieces of equipment etc, which by their nature cannot be decentralised.

Anyone know of research out there into changing the way that packet-routing/switching works so that 'DDOS' just isn't a thing? Of course I appreciate there are a lot of things to get right in that!

  • What would that look like? A network with built-in rate & connection limiting?

    The closest thing I can think of is the Gemini protocol browser. It uses TOFU for authentication, which requires a human to initially validate every interaction.

  • Something like a mega-transnational-parent ISP authority and give tech giants LaLiga kind of power.

  • It's impossible to stop DDoS attacks because of the first "D".

    If a botnet gets access through 500k IP addresses belonging to home users around the world, there's no way you could have prepared yourself ahead of time.

    The only real solution is to drastically increase regulation around security updates for consumer hardware.

    • Maybe that's the case, but it seems like this conclusion is based on the current architecture of the internet. Maybe there are ways of changing it that mean these issues are not a thing!

  • Build it into the protocol that you must provide bandwidth in order to have your requests served. A bit like forcing people to seed torrents.

    • Works for static content and databases, but I don't think it works for applications where there is by necessity only one destination that can't be replicated (e.g. a door lock).

Can't even change my nameservers away from Cloudflare as Namecheap use Cloudflare!!

There was an article on HN a few days back (I've lost the source) about how companies like this influence the overall freedom of the web and impose their own way of doing things. I see similar influence from Vercel, for example with enterprise. And just a few days back, we saw the same with AWS.

And no lesson about single point of failure and centralization was learned that day.

Later today or tomorrow there's going to be a post on HN pointing to Cloudflare's RCA and multitudes here are going to praise CF for their transparency. Let's not forget that CF sucks and took half the internet down for four hours. Transparency or no, this should not be happening.

  • A lot of things shouldn't be happening. The fact is that no one forced half the internet to make CF their point of failure. The internet should ask itself if that was the right call.

The danger of Internet centralization in Cloudflare

  • That's why I run my server on 7100 chips made for me by Sam Zeloof in his garage on a software stack hand coded by me, on copper I ran personally to everyone's house.

I was shouting at our network guy/colleague, asking how on earth challenges.cloudflare.com got blocked!! Damn, I must apologise to him.

  • Even if he blocked it by accident, that is not a reason to shout.

    Shouting will not prevent errors, and you are only creating a hostile work environment where not acting is better than the risk of making a mistake and triggering an aggressive response on your part.

    • There is nothing else to do since CF is down... so.

      There is nothing wrong with shouting during a perceived outage. Shouting is just raising your voice to give a notion of urgency. Yelling is different.

      How often have you heard "shout at me", or something like that?

      OP, continue to shout when it's needed, just don't yell at people you work with ;)

It's so crazy and scary that Cloudflare is the single point of failure for the internet.

  • But this decision is not determined by CF. It's how the devs decided.

    • Trying to figure out if this observation was intended to frame it so that it's less|same|more scary. The effect is more, but it sounds like the intention was less.

> A fix has been implemented and we believe the incident is now resolved. We are continuing to monitor for errors to ensure all services are back to normal. Posted 3 minutes ago. Nov 18, 2025 - 14:42 UTC

Seems like they think they've fixed it fully this time!

  • Close! They just updated their status, and it's back to working on a fix:

    Update - Some customers may be still experiencing issues logging into or using the Cloudflare dashboard. We are working on a fix to resolve this, and continuing to monitor for any further issues. Nov 18, 2025 - 14:57 UTC

> Cloudflare Global Network experiencing issues

> Investigating - Cloudflare is aware of, and investigating an issue which potentially impacts multiple customers. Further detail will be provided as more information becomes available.

Things are back up (a second time) for me.

Cloudflare have now updated their status page to reflect the problems. It doesn't sound like they are confident the problem is fully fixed yet.

Edit: and down again a third time!

It's knocked out Turnstile too, which means I can't even log in to my Cloudflare dash to bypass my site's proxying via Cloudflare.

More vibe code gets into production. AWS, Azure and Cloudflare all have major issues.

Coincidence? I think not.

This sentence is slowly getting boring after all these recent outages: my web app hosted on Hetzner and BunnyCDN still works.

That shows the distributed nature of the internet is still there. It becomes a problem, though, when everything is funneled through one provider.

I got several emails from some uptime monitors I setup due to failing checks on my website and funnily enough I cannot log into any of them.

BetterStack, InStatus and HetrixTools seemingly all use Cloudflare on their dashboards, which means I can't login but I keep getting "your website/API is down" emails.

Update: I also can't login to UptimeRobot and Pulsetic. Now, I am getting seriously concerned about the sheer degree of centralization we have for CDNs/login turnstiles on Cloudflare.

This reminds me that I really like self-hosting. While it is true that many things don't work right now, all of my own services do. It has some tradeoffs, of course.

Phew, my latest 3.5-hour workshop about Obsidian was saved. I recorded it this morning, not knowing about the Cloudflare issue (it probably started while I was busy). I'm using Circle.so and they're down (my community site is now inaccessible). Luckily, they probably use AWS S3 or similar to host their files, so that part is still up and running.

Meanwhile all my sites are down. I'll just wait this one out, it's not the end of the world for me.

My GitHub Actions are also down for one of my projects because some third-party deps go through Cloudflare (the Vulkan SDK). Just yesterday I was thinking to myself: "I don't like this dependency on that URL..." Now I like it even less.

How come HN is never down with all these outages?

ChatGPT is Down. What will LinkedIn posters ever do?

Everyone laughs when AWS collapses, everyone is silent when Cloudflare collapses. Why? Because the place to laugh has collapsed.

Funny that I could not load Twitter to see if Cloudflare was down.

I rushed to Hacker News, but it was too early. Clicking on "new" did the job of finding this post before it made it to the homepage :)

The web is still alive!

Seems like ChatGPT and Claude are also affected. (CLI Codex still seems to work).

RIP to the engineers fixing this without any AI help.

  • For me right now, Claude.ai is down, but Claude Code (terminal, extension) seems to be up and happy. That suggests the API is probably up.

Speaking of 5 9s, how would you achieve 5 9s for a basic CRUD app that doesn't need to scale but still needs to be globally accessible? No auth, microservices, email, or third-party services. Just a classic backend connected to a db (any db tech, hosted wherever) that serves up some HTML.

  • It depends on the infrastructure you're running on. There was a post yesterday going into some depth on how you do such calculations: https://authress.io/knowledge-base/articles/2025/11/01/how-w...

    You probably cannot achieve this with a single node, so you'll at least need to replicate it a few times to overcome the normal 2-3 9s you get from a single node. But then you've got load balancers and DNS, which can also act as single points of failure, as seen with Cloudflare.

    It also varies depending on the database type and choice. If you've got a single node of Postgres, you can likely never achieve more than 2-3 9s (AWS guarantees 3 9s for multi-AZ RDS). But if you run multi-master CockroachDB or similar, or use Spanner, you can maybe achieve 5 9s just on the database layer. You'll basically need 5 9s at every layer, which means quite a bit of redundancy in everything between your users and your data. The database and DNS are the most difficult.

    A reliable DNS provider with 5 9s of uptime guarantees -> multiple load balancers, each with 3 9s -> each load balancer serving 3 or more app instances, each with 3 9s of availability -> a database (or databases) with 5 9s.

    This page from Google shows their uptime guarantees for Bigtable: 3 9s for a single-region cluster, 4 9s for multi-cluster, and 5 9s for multi-region:

    https://docs.cloud.google.com/architecture/infra-reliability...

    In general it doesn't really matter what you're running; it is all about redundancy, whether that's instances, cloud vendors, regions, zones, etc. A rough sketch of the arithmetic is below.
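    As a back-of-the-envelope illustration (a sketch only: the component availabilities and the assumption of independent failures are mine, not from the linked posts), you can see why DNS and the database dominate:

      // Serial components multiply; n redundant copies of a component: 1 - (1 - a)^n
      const parallel = (a: number, n: number): number => 1 - Math.pow(1 - a, n);

      const dns = 0.99999;             // assumed five-nines DNS provider
      const lb  = parallel(0.999, 2);  // two load balancers at three nines each
      const app = parallel(0.999, 3);  // three app instances at three nines each
      const db  = 0.99999;             // assumed five-nines multi-master database

      const total = dns * lb * app * db;
      console.log(total.toFixed(7));   // ~0.9999790 -- still short of five nines end to end

    Even with heavily redundant load balancer and app tiers, the two five-nines serial components cap the whole chain at roughly 4.7 nines, so every serial dependency has to be pushed well past your end-to-end target.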

  • Part of the uptime solution is keeping as much of your app and infrastructure within your control as possible, rather than being at the mercy of mega-providers, as we've witnessed in the past month with Cloudflare and AWS.

    Probably:

    - a couple of tower servers, running Linux or FreeBSD, backed up by a UPS and an auto-run generator with 24 hours' worth of diesel (depending on where you are and the local area's propensity for natural disasters - maybe 72 hours),

    - Caddy for a reverse proxy, Apache for the web server, PostgreSQL for the database;

    - behind a router with sensible security settings, that also can load-balance between the two servers (for availability rather than scaling);

    - on static WAN IPs,

    - with dual redundant (different ISPs/network provider) WAN connections,

    - a regular and strictly followed patch and hardware maintenance cycle,

    - located in an area resistant to wildfire, civil unrest, and riverine or coastal flooding.

    I'd say that'd get you close to five 9s (no more than ~5 minutes of downtime per year), though I'd pretty much guarantee five 9s (maybe even six 9s - no more than 32 seconds of downtime per year) if the two machines were physically separated from each other by a few hundred kilometres, each with its own supporting infrastructure as above, sans the load balancing (see below), over two separate network routes.

    Load balancing would become human-driven in this 'physically separate' example (cheaper, less complex): if your-site-1.com fails, simply re-point your browser to your-site-2.com which routes to the other redundant server on a different network.

    The hard part now will be picking network providers that don't share the same pipes/cables, or that don't both rely on Cloudflare or AWS...

    Keep the WAN IPs written down in case DNS fails.

    PostgreSQL can do master-master replication, but I understand it's a pain to set up.

  • What if you could create a super virtual server of sorts? Imagine a new cloud provider like Vercel, but called something else. When you create a server on their service, they create three servers behind the scenes - one on AWS, one on GCP, and one on Azure - but to the end user it looks like a single server. The end user gets to control how many cloud providers are involved. When AWS goes down, no worries, it switches over to the GCP one.

Looking forward to seeing their RCA. I'm guessing it's going to be glossy in terms of actual customer impact. "We didn't go offline, we just had 100% errors. For 60 minutes."

Cloudflare seems to have degraded performance. Half the requests for my site throw Cloudflare 5xx errors, the other half work fine.

However, https://www.cloudflarestatus.com/ does not mention anything relevant. What's the point of having a status page if it lies?

Update: Ah, I just checked the status page again and now I get a big red warning (though the problem had existed for ~15 minutes before 11:48 UTC):

> Investigating - Cloudflare is aware of, and investigating an issue which potentially impacts multiple customers. Further detail will be provided as more information becomes available. Nov 18, 2025 - 11:48 UTC

  • >However the https://www.cloudflarestatus.com/ does not really mention anything relevant. What's the point of having a status page if it lies ?

    What is the lie?

    > Cloudflare Global Network experiencing issues

    Cloudflare has a specific service named "Network", and it's having issues.

    • Please read my comment again including the update:

      For 15 minutes Cloudflare wasn't working and the status page didn't mention anything. Yes, right now the status page mentions the serious network problem, but for a while our pages weren't working and we didn't know what was happening.

      So for ~15 minutes the status page lied. The whole point of a status page is to not lie, i.e. to be updated automatically when there are problems, not by a person who needs clearance on what and how to write.

  • > What's the point of having a status page if it lies ?

    Status pages are basically marketing crap right now. The same thing happened with Azure where it took at least 45 minutes to show any change. They can't be trusted.

At some point we really need to ask whether this is the web we want: one or two major actors go down and everything goes with them.

Not downplaying the immense infra/engineering work at this scale, but my neighborhood grocery market's site shouldn't be down.

  • It's hard not to use Cloudflare at least for me: good products, "free" for small projects, and if Cloudflare is down no one will blame you since the internet is down.

  • There’s certainly a business case for “which nines” after the talk of n nines. You ideally want to be available when your competitor, for instance, is not.

  • Setting up a replica and pointing your API requests at it when the Cloudflare request fails is trivial. That way, if you have a SPA, users won't notice as long as the site/app is already open.

    The issue is DNS, since DNS propagation takes time. Does anyone have any ideas here?

    • > Setting up a replica and then pointing your api requests at it when cloudflare request fails is trivial.

      Only if you're doing very basic proxy stuff. If you stack multiple features and maybe even start using workers, there may be no 1:1 alternatives to switch to. And definitely not trivially.

    • Two domains for your API, perhaps; a full-blown SPA could try one and then the other (a rough sketch is below).
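      A minimal client-side sketch of that idea (the hostnames, the status handling, and the fallback order are all assumptions for illustration, not something Cloudflare provides):

        // Try a list of API hosts in order; fall through on network errors or 5xx responses.
        const API_HOSTS = [
          "https://api.example.com",        // hypothetical: proxied through Cloudflare
          "https://api-direct.example.com", // hypothetical: DNS-only record pointing at the origin
        ];

        async function apiFetch(path: string, init?: RequestInit): Promise<Response> {
          let lastError: unknown = new Error("no API hosts configured");
          for (const host of API_HOSTS) {
            try {
              const res = await fetch(`${host}${path}`, init);
              if (res.status < 500) return res; // treat 5xx (e.g. Cloudflare 52x) as "try the next host"
              lastError = new Error(`HTTP ${res.status} from ${host}`);
            } catch (err) {
              lastError = err; // network/DNS failure: try the next host
            }
          }
          throw lastError;
        }

      This sidesteps DNS propagation because both records already exist; the client just changes which name it asks for.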

  • > At some point we really need to think if this is the web we want,

    You think we have a say in this?

    • You have the power not to host your own infrastructure on AWS and behind Cloudflare, or, in the case of an employer, to push back against the voices arguing for the unsustainable status quo.

      3 replies →

    • The HN crowd in particular absolutely has a say in this, given the amount of engineering leads, managers, and even just regular programmers/admins/etc that frequent here - all of whom contribute to making these decisions.

  • It's not the web we want, but it's the web corporations want. And everybody else doesn't give a damn.

  • We? I am not using it. I never have and I never will. People should learn how to work with a firewall, set up a simple ModSecurity WAF, and stop using this bullshit. Almost everything goes through Cloudflare, and Cloudflare also terminates TLS for websites, so basically Cloudflare is a MITM spying proxy, but no one seems to care. :/

  • Think about this rationally. If Cloudflare doesn't fix it within reasonable time, you can just point to different name servers and have your problem fixed in minutes.

    So why be on Cloudflare to start with? Well, if you have a more reliable way then there's no reason. If you have a less reliable way, then you're on average better off with Cloudflare.

      • Well, I can't change my NS since it's on Cloudflare too. But beyond that, my point wasn't about this outage in particular, more about the default approach of some websites that don't need all this tech (and yes, I really was out of groceries).

      4 replies →

  • BLOCKCHAINS! I mean, some sort of P2P hosting and/or node discovery would be nice.

  • Why does everyone need to be behind Cloudflare? I don't think DDoSing sites on a whim is so rampant that everyone needs the virtual umbrella.

    • It's the web scrapers. I run a tiny little mom-and-pop website, and the bots were consistently using up all of my server's resources. Cloudflare more or less instantly resolved it.

      3 replies →

    • I’ve been DDoS’d countless times running a small scale, uncontroversial SaaS. Without them I would’ve had countless downtime periods with really no other way to mitigate.

    • There's plenty of DDoS if you're dealing with people petty enough.

      The VPS I use will nuke your instance if you run a game server. Not due to resource usage, but because it attracts DDoS like nothing else. Ban a teen for being an asshole and expect your service to be down for a week. And there isn't really Cloudflare for independent game servers. There's Steam Networking but it requires the developer to support it and of course Steam.

      Valve's GDC talk about DDoS mitigation for games: https://youtu.be/2CQ1sxPppV4

    • It actually is.

      I run a small video game forum with posts going back to 2008. We got absolutely smashed by bots scraping for training data for LLMs.

      So I put it behind Cloudflare and now it's down. Ho hum.

      7 replies →

    • I was arrested by Interpol in 2018 because of warrants issued by the NCA, DOJ, FBI, J-CAT, and several other agencies, all due to my involvement in running a DDoS-for-hire website. Honestly, anyone can bypass Cloudflare, and anyone who wants to take your website down will take it down. Luckily for all of us, most of the DDoS-for-hire websites are gone nowadays, but there are still many botnets out there that will get past basically any protection, and you can get access to them for as little as $5.

      5 replies →

    • There are plenty of alternatives to protect against DDoSing, people like convenience though. “Nobody gets fired for choosing Microsoft/Cloudflare”. We have a culture problem

    • Honestly it kinda is. AI bots scrape everything now, social media means you can suddenly go viral, or you make a post that angers someone and they launch an attack just because. I default to Cloudflare: like an umbrella, I might just be carrying it around most of the time, but in the case of a sudden downpour it's better than getting wet.

    • It's not super common, but common enough that I don't want to deal with it.

      The other part is just how convenient it is with CF. Easy to configure, plenty of power and cheap compared to the other big ones. If they made their dashboard and permission-system better (no easy way to tell what a token can do last I checked), I'd be even more of a fan.

      If Germany's Telekom were forced to peer at DE-CIX, I'd always use CF. Since it isn't, and CF doesn't pay for peering, it's a hard choice for Germany but an easy one everywhere else.

Seriously, bookmarking this site and checking it first next time instead of disabling all my ad blockers.

Is there any way to remove every SPOF?

Currently I have multi-region loadbalanced servers. DNS and WAF (and the load balancer) on Cloudflare.

Moving DNS elsewhere is step 1 so I'm not locked out - but then I can't use Cloudflare full stop (without enterprise pricing).

Multi-provider DNS and WAF - okay I could see how that works.

But what about the global load balancer, surely that has to remain a single point of failure?

  • No? The point of Cloudflare is that they remove the SPOF for you, though I guess we can say they didn't do it quite perfectly.

ERROR [12:00:21 UTC]: CF_EDGE_ROUTING_FAILURE. Reason: Origin-Shield connectivity loss detected within multi-region fabric. BGP path withdrawal initiated for critical LCP clusters (LCP-LON, LCP-FRA). Status code 521/522 flood reported globally. Geo-location failover services degraded. DNS resolution timeout on 1.1.1.1/1.0.0.1. Traffic flow re-routing pending verification of internal control plane integrity.

For anyone reading this who desperately needs their website up, you can try this: If you manage to get to your Cloudflare DNS settings and disable the "Proxy status (Proxied)" feature (the orange cloud), it should start working again.

Be aware that this change has a few immediate implications:

- SSL/TLS: You will likely lose your Cloudflare-provided SSL certificate. Your site will only work if your origin server has its own valid certificate.

- Security & Performance: You will lose the performance benefits (caching, minification, global edge network) and security protections (DDoS mitigation, WAF) that Cloudflare provides.

  • This will also reveal your backend internal IP addresses. Anyone can find permanent logs of public IP addresses used by even obscure domain names, so potential adversaries don't necessarily have to be paying attention at the exact right time to find it.

  • Unfortunately, this will also expose your IP address, which may leave you vulnerable even when the WAF and DDoS protections come back up (unless you take the time to only listen for Cloudflare IP address ranges, which could still take a beefy server if you're having to filter large amounts of traffic).

  • Also, the API was working fine while the dash was down.

    If you don't have the keys, make sure to grab them before the next one.

Our national transit agency is apparently a customer.

The departure tables are borked, showing incorrect data, the route map stopped updating, the website and route planner are down, and the API returns garbage. Despite everything, the management will be pleased to know the ads kept on running offline.

Why would you put a WAF between devices you control and your own infra, God knows.

  • > Why would you put a WAF between devices you control and your own infra

    Checkbox security says a WAF is required and no CISO will put their neck on the line to approve the exemption.

Our doctor's office can't make appointments because their "system is down."

The main bike rental in Paris, Velib, has its app not working, but the bikes can still be taken with NFC. However, my station, which is always full at this time, is now empty apart from 2 broken bikes. It may be related. Oddly, push notifications are still working.

I'm going to take the metro now, wondering how long we have until the entire transit network goes down because of a similar incident.

It's been 15 minutes of it going up and down, still nothing on their status page...

Related to Azure DDoS?

https://news.ycombinator.com/item?id=45955900

What do we actually lose going from cloud back to ground?

The mass centralization is a massive attack vector for organized attempts to disrupt business in the west.

But we’re not doing anything about it because we’ve made a mountain at of a molehill. Was it that hard to manage everything locally?

I get that there’s plenty of security implications going that route, but it would be much harder to bring down t large portions of online business with a single attack.

  • > What do we actually lose going from cloud back to ground?

    A lot of money related to stuff you currently don't have to worry about.

    I remember how shit worked before AWS. People don't remember how costly and time consuming this stuff used to be. We had close to 50 people in our local ops team back in the day when I was working at Nokia 13 years ago. They had to deal with data center outages, expensive storage solutions failing, network links between data centers and offices, firewalls, self-hosted Jira running out of memory, and a lot of other crap that I don't spend much time worrying about with a cloud-based setup. Just a short list of stuff that was repeatedly an issue. Nice when it worked. But nowhere near five nines of uptime.

    That ops team alone cost probably a few million per year in salaries alone. I knew some people in that team. Good solid people but it always seemed like a thankless and stressful job to me. Basically constant firefighting while getting people barking at you to just get stuff working. Later a lot of that stuff moved into AWS and things became a lot easier and the need for that team largely went away. The first few teams doing that caused a bit of controversy internally until management realized that those teams were saving money. Then that quickly turned around. And it wasn't like AWS was cheap. I worked in one of those teams. That entire ops team was replaced by 2-3 clued in devops people that were able to move a lot faster. Subsequent layoff rounds in Nokia hit internal IT and ops teams hard early on in the years leading up to the demise of the phone business.

    • Yeah, people have such short memories for this stuff. When we ran our own servers a couple of jobs ago, we had a rota of people who'd be on call for events like failing disks. I don't want to ever do that again.

      In general, I'm much happier with the current status of "it all works" or "it's ALL broken and it's someone else's job to fix it as fast as possible"!

      Not saying it's perfect, but neither was on-prem/colocation.

DigitalOcean + Gandi means nothing I run is down. Amazing. We depend far too heavily on centralised services, deeming the value of reputation and convenience to exceed the potential downsides, and then the world pays for it. I think we have to feel a lot more of this pain before regulation kicks in to change things, because the reality is people don't change. The only thing you can personally do is run your own stuff wherever you can.

  • DigitalOcean is indeed having issues.

    > Application error: a client-side exception has occurred while loading www.digitalocean.com (see the browser console for more information).

    Yellow flags on status.digitalocean.com

  • Individually, for you, what's the difference?

    You use a service provider, if that service provider is down, your site is down. Does it matter to you that others are also down in that instance?

    • Might even be better to go down at the same time as everyone else, because customers might be more lenient on you.

Recently, several of my VPN server nodes (VPSes from different providers) just randomly could not connect to Cloudflare CDN IPs, while the host Linux network did not have the issue; VPP shares the same address with Linux and uses tc stateless NAT to do the trick.

I finally worked around this by changing the TCP options sent by the VPP TCP stack.

The whole thing made me worry that something newly deployed was causing the issue.

I don't think it's related to this network issue; it just reminded me of the above. There seem to be frequent new articles about Cloudflare networking, so maybe new methods or new deployments correlate with a higher probability of issues.

This is worse than the Amazon outage. I couldn't even log in to Cloudflare.

Wow, with an outage of this scale, the loss must be measurable in global GDP.

I've been considering Cloudflare for caching, DDoS protection and WAF, but I don't like furthering the centralization of the Web. And my host (Vultr) has had fantastic uptime over the 10 years I've been on them.

How are others doing this? How is Hacker News hosted/protected?

It's interesting to see hacker news response time reaching almost 2 seconds for this post.

I got an email saying that my OpenAI auto-renewal failed and my credits have run out. I go to OpenAI to reauthorize the card, and I can't log in because OpenAI uses Cloudflare for "verifying you are a human", which goes into an infinite loop. Great.

The whole damn internet now depends on them. I guess I am bullish for $NET

Cloudflare runs a high demand service, and the centralisation does deserve scrutiny. I think a good middle ground I’ll adopt is self hosting critical services and then when they have an outage redirect traffic to a Cloudflare outage banner.

So they broke the internet. Nice! I've never seen so many sites not working, never seen so many desktop apps suddenly stop working. I wouldn't want to be the person responsible for this. And this has again taught me it's better not to rely on external services, even when they seem too big to fail.

Luckily for everyone including Guilhermo he can't dunk on the situation since x.com is down as well.

This Internet thing is steadily becoming the most fragile attack surface out there. No need for nuclear weapons anymore; just hit Cloudflare and AWS and we are back to the stone age.

Makes you realise that if Cloudflare or one of these large organisations decides to (or gets ordered by a deranged US president to) block your internet access, that's a whole lot of internet you're suddenly cut off from. Yes, I know there are circumventions, but it's still a worrying thought.

Hey, this is fun, all my websites are still up! I wonder how that happened? I don't even have to worry about my docker registry being down because I set up my own after the last global outage.

  • I had a lot of fun like you as well, until I got my first DDoS and bot attacks. There's a reason Cloudflare has 20% of internet traffic.

  • Does it cost you a lot?

    One of my other worries is having to fight bots over a couple of hobby sites while I have other fires to put out (generally in life).

Since when does critical infrastructure fail weekly?! One week it's AWS, then Azure + AWS, now Cloudflare...

Time to go back to on prem. AWS and co are too expensive anyways

  • A lot of people are "on prem" but use CloudFlare to proxy traffic for DDoS attack mitigation, among other reasons.

everything is down except HN :D

The sites I host on Cloudflare are all down. Also, even ChatGPT was down for a while, showing the error: "Please unblock challenges.cloudflare.com to proceed."

Why are we seeing AWS, then Azure, then Cloudflare all going down just out of the blue? I know they go down occasionally, but it's typically not major outages like this...

10.30pm here in Australia...

and my alarms are going off and my support line is ringing...

I can't even log in to my CF dashboard to disable the CDN!

Edit: It's back. Hopefully it will stay up!

Edit 2: 1 Hour Later.

Narrator: It didn't stay up :/

I'm genuinely curious how much of the web depends on cloudflare and AWS. This centralisation sucks though

If someone wanted to learn about how the modern infrastructure stack works, and why things like this occur, where would be some good resources to start?

I can't rebuild my NixOS image because of this lol. (chrome install not working)

Why do people use the reverse proxy functionality of Cloudflare? I've worked at small to medium sized businesses that never had any of this while running public facing websites and they were/are just fine.

Same goes for my personal projects: I've never been worried about being targeted by a botnet so much that I introduce a single point of failure like this.

  • Any project that starts gaining any bit of traction gets hammered with bots (the ones that try every single /wp URL even though you don't even use WordPress), frequent DDoS attacks, and so on.

    I consider my server's real IP (or load balancer IP) as a secret for that reason, and Cloudflare helps exactly with that.

    Everything goes through Cloudflare, where we have rate limiters, Web firewall, challenges for China / Russian inbound requests (we are very local and have zero customers outside our country), and so on.

  • People think that running Node.js servers is a good idea, but those fall over at so much as a stiff breeze, so they put Cloudflare in front and call it a day.

  • It gives really good caching functionality so you can have large amounts of traffic and your site can easily handle it. Plus they don't charge for egress traffic.

  • I’m surprised your projects aren’t plagued by massive waves of scraping traffic like the rest of us. Count yourself lucky, not superior.

    • What exactly are you serving that bot traffic affects your quality of service?

      I've seen an RPi serve a few dozen QPS of dynamic content without issue... The only service I've actually had taken down by benign bots is a Gitea-style git forge (which was "fixed" by deploying Anubis in front of it).

  • It's chic. Young bois, or adult people with a boi-like mentality.

    What, they have Cloudflare and we don't? We must also have Cloudflare. Don't ask why.

    Now that you have it, you are at least level 15 and not a peasant.

    The same applies to every braindead framework on the web. The gadget mind of the bois is the cause of all this.

There is an election in Denmark today, I wonder if this will affect that. The governments website is not accessible at the moment because it uses Cloudflare.

Meanwhile my Wordpress blog on DigitalOcean is up. And so is DigitalOcean.

My ISP is routing public internet traffic to my IPs these days. What keeps me from running my blog from home? Fear of exposing a TCP port, that's what. What do we do about that?

  • > What keeps me from running my blog from home?

    Depending on the contract, you might not be allowed to run public network services from your home network.

    I had a friend doing that, and once his site got popular the ISP called (or sent a letter? I don't remember anymore) with "take this 10x more expensive corporate contract or we will block all this traffic".

    In general, the reason ISPs don't want you to do that (beyond the much more expensive corporate rates) is the risk of someone DDoSing that site, which could cause issues for large parts of their domestic customer base (and, depending on the country, make them liable to compensate those customers for not providing a service they paid for).

  • > Our Engineering team is actively investigating an issue impacting multiple DigitalOcean services caused by an upstream provider incident. This disruption affects a subset of Gen AI tools, the App Platform, Load Balancer, Spaces and provisioning or management actions for new clusters. Existing clusters are not affected. Users may experience degraded performance or intermittent failures within these services.

    > We acknowledge the inconvenience this may cause and are working diligently to restore normal operations. Signs of recovery are starting to appear, with most requests beginning to succeed. We will continue to monitor the situation closely and provide timely updates as more information becomes available. Thank you for your patience as we work towards full service restoration.

    It's not down for you, but for others.

  • Yeah, DigitalOcean and Dreamhost are both up. I actually self-host on 2Gig fibre service, and all my stuff is up, except I park everything behind Cloudflare since there is no way I could handle a DDoS attack.

Yesterday I decided to finally write my makefiles to "mirror" (make available offline) the docs of the libraries I'm using. doc2dash for sphinx-enabled projects, and then using dash / zeal.

Then I was like... "when did I last time fly for 10+ hours and wanted to do programming, etc, so that I need offline docs?" So I gave up.

Today I can't browse the libs' docs quickly, so I'm resuming the work on my local mirroring :-)

https://news.ycombinator.com/user?id=jgrahamc

>I was Cloudflare's CTO.

A gentle reminder to not take any CF-related frustrations out on John today.

  • He's now on the Board, so he hasn't left.

    Not that I think blaming individuals on forums who are already under stress is a good strategy anyway.

  • Oh no, we can’t take a (former) executive to task about what they’ve wrought with their influence!!! That would be wrong.

    If anything, he should be the first to be blamed for the greater and greater effect this tech monster has on internet stability, since, you know, his people built it.

This centralisation is worrisome. Single points of failures have always been a bad idea, especially when that point of failure is out of your control.

PS: Someone really doesn't want Gemini 3 to get air time today.

I thought I would be clever by switching domain endpoints from proxied to dns but Cloudflare admin page is also not working correctly ;)

edit: it's up!

edit: it's down!

I wish the "pause CF" button would work via API or via any other way, even if there is an outage like this.

Funny how I couldn't even check on Downdetector.com - because it takes me to a Cloudflare-run captcha, which is now stuck on loading.

The internet is officially down.

I got an invoice from them right before the outage. Hopefully when they restore everything, they'll have forgotten about it!

I had two completely unrelated tabs open (https://twitter.com and https://onsensensei.com), both showing the same error. Opened another website, same error. Kinda funny to see how much of the entire web runs on Cloudflare nowadays.

  • Love how everyone plays with redundancy - multiple hosts, load balancers, etc. - and yet half of the web relies on a single point of failure: CF.

I think everyone is in the same boat with thinking they took something offline :^)

Concerning though how much the web relies on one (great) service.

I sometimes question my business decision to run a multi-cloud, multi-region web presence when it is apparently totally acceptable to just be down along with the big boys.

  • That was something we discussed at my workplace.

    Prior hosting provider was a little-known company with decent enough track record, but because they employed humans, stuff would break. When it did break, C-suite would panic about how much revenue is lost, etc.

    The number of outages was "reasonable" to anyone who understood the technical side, but non-technical would complain for weeks after an outage about how we're always down, "well BigServiceX doesn't break ever, why do we?", and again lost revenue.

    Now on Azure/Cloudflare, we go down when everyone else does, but C-Suite goes "oh it's not just us, and it's out of our control? Okay let us know when it fixes itself."

    A great lesson in optics and perception, for our junior team members.

  • Many services have just disabled the CF proxy and use only DNS. If your end server has SSL and can handle some traffic, it might work for a while.

Time to check Hacker News instead of work. Even my usual procrastination websites are down due to this.

Cloudflare Dashboard/clicky-clicky UI is down. I really appreciate that their API is still working. A small change in our Terraform configuration and now I can go to lunch in peace, knowing our clients at skeeled can keep working if they want to:

    resource "cloudflare_dns_record"
    -  proxied = true
    +  proxied = false

Down, but the linked status page shows mostly operational, except for "Support Portal Availability Issues" and planned maintenance. Since it was linked, I'm curious if others see differently.

edit: It now says "Cloudflare Global Network experiencing issues" but it took a while.

It is a relief that they hosted the status page on someone else's infrastructure.

You can't even turn off caching from Cloudflare because...the Cloudflare dashboard is down.

So everyone who's wrapped their host with Cloudflare is stuck with it.

No logging in to Cloudflare Dash, no passing Turnstile (their CAPTCHA Replacement Solution) on third-party websites not proxied by Cloudflare, the rest that are proxied throwing 500 Internal server error saying it's Cloudflare's fault…

Feels like half the internet is down.

Investigating - Cloudflare is aware of, and investigating an issue which potentially impacts multiple customers. Further detail will be provided as more information becomes available. Nov 18, 2025 - 11:48 UTC

Yeah, "multiple customers" is like 70% of the internet.

It's down here in Sydney as well. The status page hasn't been updated to reflect that

We finally switched to CF a few weeks ago (for bot protection, abusive traffic started getting insane this year), finally we can join in on one of the global outage parties (no cloud usage otherwise, so still more uptime than most).

Wow, so much is down. Nothing Cloudflare protected is loading for me in Indiana, and the Cloudflare dashboard is broken as well.

I hope it gets resolved in the next hour or two, or it could be a serious problem for me.

Gemini and other agents are now failing when they search for something on the web. ChatGPT can't even be accessed.

Well, I was reading the docs for Express and shouted "wtf" a couple of times before seeing this post on HN.

I'm thinking about all those quips from a few decades back, along the lines of: "The Internet is resilient, it's distributed and it routes around damage" etc.

In many ways it's still true, but it doesn't feel like a given anymore.

Anyone seeing a link between AI-generated infra code and this year’s wave in popular service outages?

Maybe this incident will make people rethink putting Cloudflare blindly in front of every website.

  • In theory, even a single company's service could be distributed so that only a fraction of websites would be affected; being a single point of failure isn't a necessity. So I still don't like the "you see what happens when over half of the internet relies on Cloudflare" argument. And yes, I'm writing this as a Cloudflare user whose blog is now down because of this. Cloudflare is still convenient and accessible for many people; no wonder it's so popular.

    But, yeah, it's still a horrible outage, much worse than the Amazon one.

    • The "omg centralized infra" cries after every such event kind of misses the point. Hosting with smaller companies (shared, vps, dedi, colo whatever) will likely result in far worse downtimes, individually.

      Ofc the bigger perception issue here is many services going out at the same time, but why would (most) providers care if their annual downtime does or doesn't coincide with others? Their overall reliability is no better or worse had only their service gone down.

      All of this can change ofc if this becomes a regular thing, the absolute hours of downtime does matter.

      1 reply →

These big cloud providers are turning into giant off-switches for the internet

Some CDNs are down too, for example cdn.tailwindcss.com. And apparently I can't log into Hacker News?

We're on the enterprise plan, so far we're seeing Dashboard degradation and Turnstile (their captcha service) down. But all proxying/CDN and other services seem to work well.

A whole bunch of local South African sites are dead with Cloudflare HTTP 500 errors. I can see Lisbon & Amsterdam crashing out.

Oh, look! Cloudflare is down. Let's check down detector to make sure it's not just me > Downdetector is using Cloudflare captcha. Yep, it's down.

Funny that their status page shows almost all locations as "Operational" when they're not. Are they updating the page manually and keeping it green?

  • I assume the locations are operating fine, since you can see the error pages. The culprit here is probably the Network, which at the time of writing, shows up as offline

Ah! Well, all of my websites are down! I’m going to take screenshots and have it as part of my Time Capsule Album, “Once upon a Time, my websites used to go down.”

The irony of being in the middle of reading how Basecamp got off the cloud and the external link being down with a CF error :D

Strange thing is this spans multiple CF regions; everything using bot protection & WAF is down. I just got a colleague to check our site and both the London & Singapore Cloudflare servers are out... And I can't even log in to the Cloudflare dash to re-route critical traffic. This is likely accidental, but one day something malicious will have a big impact, given how centralised the internet now is.

What a wild ride, the traffic to my site is more akin to a rollercoaster. Got better for a few mins and then fell back apart.

Looks like the status page is suffering too because it can't load jQuery:

    (index):64 Uncaught ReferenceError: $ is not defined
        at (index):64:3

What is funny is that on their global status list of services, everything looks green except "Network", which is "offline".

  Our support portal provider is currently experiencing issues

Are they using Cloudflare perchance? (scnr)

I would love to be a fly on the wall in the room where Cloudflare response engineers are working right now.

All trains are stuck in the south of France due to « broken signalling ». Wonder how related this is.

Edit: it was related

https://www.laprovence.com/article/region/83645099971988/pan...

Edit2: They edited the article stating it wasn't related.

  • If they do use Cloudflare... why in the everlasting name of Hell did they connect a railway control and signalling system to the Internet?!!!

    • Because javascript programmers are cheaper/easier/whatever to hire? So everything becomes web-centric. (I'm hoping for this comment to be sarcastic but I wouldn't be surprised if it turns out not to be)

      1 reply →

This is ridiculous:

They just posted:

Update We've deployed a change which has restored dashboard services. We are still working to remediate broad application services impact Posted 2 minutes ago. Nov 18, 2025 - 14:34 UTC

but... I'm stuck at the captcha, which does not work: dash.cloudflare.com - "Verifying you are human. This may take a few seconds."

"dash.cloudflare.com needs to review the security of your connection before proceeding."

They offer a great service, for now, I hear.

Unfortunately, that means they can also break 75% of the internet.

When this kind of thing happens it makes me feel better about my own programming problems.

I wonder if it has anything to do with the replicate.com purchase? Probably not.

I started restarting my own servers thinking something went awry again, that's how much I usually trust them not to be down. Interesting.

Why do I always get "Server Error" and not an explanation that Cloudflare is having problems? This makes me look bad in front of my customers.

Used a down-detector site to check if Cloudflare is down, but the site runs on Cloudflare, so I couldn't check whether Cloudflare was down for anyone else, because Cloudflare was down.

Yep, got around 100 SMSs from our uptime monitoring service that our Cloudflare sites are down. Nothing much we can do but wait.

ERROR [11:57:30 UTC]: EC2 Launch Failure. Reason: [Security Breach Remediation] Control Plane Metadata Service (IMDS) temporarily offline. System state reports: Dependency integrity check failed (Exit Code 0x80070002). Cannot retrieve authorized kernel image or block device mapping. Termination signal initiated for compromised worker nodes.

I was reading up on home lab server racks, and every single site is down with a Cloudflare error. So much for DIY!

Well it was bound to happen eventually, the "Down Roulette" has decided it should be Cloudflare this week!

It’s been 45 minutes and I’m already looking forward to the day Kevin Fang makes a video about this

For fun, I asked google what's an alternative to Cloudflare. It says, "A complete list of Cloudflare alternatives depends on which specific service (CDN, security, Zero Trust, edge computing, etc.) you are replacing, as no single competitor offers the exact same all-in-one suite"

Seems like Cloudflare activated the maximum LLM-scraper bot protection for everyone.

Suddenly feeling better about our 99.9% uptime SLA.

When even Cloudflare goes down, nobody can blame the little guys.

API still seems to work if you already have a script to hand to unproxy everything.

Ukraine. Sporadic outages as well. Error pages blame Cloudflare Warsaw servers.

Poland. Most of the popular sites are down. Including community forum on Cloudflare.

I host everything on Linode (have for over a decade) and am never caught up in these outages.

  • Linode has been rock solid for me. I wanted to back this comment with uptime numbers, unfortunately the service I use for that, Uptime Robot, is down because of Cloudflare...

Crazy to think that it's apparently acceptable to centralize the web like that.

Didn't have my site on Cloudflare because it would be faster for Chinese users (its main demographic), so I THOUGHT I was fine for a second, until I remembered the data storage API is behind Cloudflare.

AWS, then Azure, now Cloudflare. Welcome to the AI era. Meanwhile my hetzner vServer has been running for three years without issues.

Interesting that HN is not on Cloudflare, but the YC website is behind Cloudflare, so it's also down.

Cloud in general was a mistake. We took a system explicitly designed for decentralization and resilience and centralized it and created a few neat points of failure to take the whole damn thing down.

  • Cloudflare provides some nice services that have nothing to do with cloud or not. You can self-host private tunnels, application firewalls, traffic filtering, etc, or you can focus on building your application and managing your servers.

    I am a self-hosting enthusiast, so I use Hetzner, Kamal, and other tools to manage our servers ourselves, but we still have Cloudflare in front of them because we didn't want to handle the parts I mentioned (yet; we might sometime).

    Calling it a mistake is a very narrow view. Just because it goes down every now and then doesn't make it a mistake. Going cloud or not has its trade-offs, and I agree that paying 200 dollars a month for a 1GB Heroku Redis instance is complete madness when you can get a 4GB VPS on Hetzner for 3.80 a month. Then again, some people are willing to make that trade-off to avoid managing the servers themselves.

    Cloud servers have taught me so much about working with servers because they are so easy and cheap to spin up, experiment with and then get rid of again. If I had had to buy racks and host them each time I wanted to try something, I would've never done it.

    • Sure, it's a great fair-weather technology, makes some things cheap and easy.

      But in the face of adversity, it's a huge liability. Imagine Chinese Hackers taking down AWS, Cloudflare, Azure and GCP simultaneously in some future conflict. Imagine what that would do to the West.

      I don't believe in Fukuyamas End of History. History is still happening, and the choices we make will determine how it plays out.

    • Thanks, I was too lazy to write this, and noticed this comment multiple times now. It's good to be sceptical at times, but in this case it simply misses the mark.

  • Threat actors (DDoS) and AI scraping already threw a wrench in decentralization. It's become quite difficult to host anything even marginally popular without robust infrastructure that can eat a lot of traffic

  • It took me a while to understand it, but the beauty of it is that when it fails, lot of things fail.

    Almost no one gets mad if your site and half the internet were down.

    • Sure, but that is also a giant weakness. Say in a future conflict with Russia or China, or hell, even North Korea.

      They'd only have to take down a few services to completely cripple the West - the exact case ARPANET was designed to prevent.

      1 reply →

When will Cloudflare actually split into several totally independent companies to remedy that they bring down the Internet every time they have a major issue?

Why the hell is my claude saying "Please unblock challenges.cloudflare.com to proceed."

And then still failing anyway? Why do I need CloudFlare to access claude.io? Wtf?

Lots of valid concern about us all using CF, but is there an alternative to their WAF that isn't enterprise-expensive?

ELON! GO AND KICK THOSE CLOUDFLARE ASSES!

or find a new job for yourself. Maybe digging to the Earth's core. Why? Idk. Because then you can say: I did it, or something.

Just yesterday Cloudflare announced it was acquiring Replicate (an AI platform). "The Workers Platform mission: Our goal all along has been to enable developers to build full-stack applications without having to burden themselves with infrastructure," according to Cloudflare's blog. Are we cooked?

I honestly think people should practice more chaos engineering themselves: switch off services like Cloudflare at random and have failure plans.

I am paying for this shit service and this is my longest downtime I had in years. Can anyone recommend any other bottleneck to be annoyed with in future?

It's funny: I first noticed this visiting a random blog, then went on X and got the same error... Is Cloudflare the Internet now?

My uptime monitor OnlineOrNot is also down...

  • OnlineOrNot's fallen back to AWS for monitoring, so you should still be getting alerts.

    The dashboard's API server runs on Cloudflare and is currently blocking all logins, will fix.

Some sites are already up again, including the CF dash and Downdetector, both of which were ironically down a few minutes ago.

Cloudflare captchas don't work, which has taken down both Claude and Perplexity for me.

Lovely.

We really do have two surprise holidays every year: AWS Day and Cloudflare Day. Happy outages, everyone.

My Cloudflare Pages website is down - 500 server error :(

Cannot log in to get to Workers to check - auth errors.

I thought this was the point of a cached CDN!

Is anybody keeping statistics on the frequency of these big global internet outages? It seems to be happening extremely frequently as of late, but it would be nice to have some data on that.

Once again vindicated by running my own CDN and not living with the irrational belief that everything needs cloudflare.

This must be affecting a lot of sites? I'm trying to access tailwindcss and I can't either!

Genuinely makes me sad for the people there. This must be a living nightmare right now.

  • Why? If any company has enough technical people, resources & processes in place it must be them, no?

My theory is that people's skills are getting worse. Attention spans are diminishing, memory is shrinking. People age and retire, new less skilled generations are replacing them. There are studies about declining IQ in the last decades. Probably mobile phones and social media are to blame.

We see the signs with Amazon and Cloudflare going down, Windows Update breaking stuff. But the worse is yet to come, and I am thinking about airport traffic control, nuclear power plants, surgeons...

The biggest lesson for me from this incident: NEVER make your DNS provider and CDN provider the same vendor. Now I can't log in to the dashboard, even to switch the DNS. Sigh.

chatgpt.com is not working because they rely on Cloudflare for challenges.

ChatGPT isn't working.

No suicides caused by ChatGPT today. Billions of dollars in GPUs will sit idle. A sudden drop in LinkedIn content...

World is a better place

I think companies are firing the wrong people, which is why we get these downtimes so often.

Probably a good time to contact the CEO of Cloudflare.

Looking forward to the post-mortem.

Supabase is down bad too... need to work on my project!

  • Haha they updated their status page: "Identified - A global upstream provider is currently experiencing an outage which is impacting platform-level and project-level services"

    A global upstream provider :)

HN has become the place to check if any HyperScaler + Cloudflare is down.

  • While my colleagues are wondering why Cloudflare isn't working and are afraid it might be something on our end, I check here first to make sure it's not a Cloudflare/AWS problem in the first place.

  • I actually came here to check because downforeveryoneorjustme.com and downdetector are offline as well.

Can we at some point acknowledge that constant cloud disruptions are too costly, and can we then finally move all of our hosting back on-prem?

  • It's the old IBM thing. If your website goes down along with everyone else's because of Cloudflare, you shrug and say "nothing we could do, we were following the industry standard". If your website goes down because of on-prem then it's very much your problem and maybe you get to look forward to an exciting debrief with your manager's manager.

    • That's lazy engineering and I don't think we as technical, rational people should make that our way of working. I know the saying, but I disagree with it. My fuckups, my problem, but at least I can avoid fuckups actively if I am in charge.

      7 replies →

  • Funnily and ironically enough, I was trying to check out a few things on Ansible Galaxy and... I ended up here trying to submit the link for the CF ongoing incident

  • I would only consider doing stuff on-prem because of services like Cloudflare. You can have some of the global features like edge-caching while also getting the (cost) benefits of on-prem.

  • can you define "constant"

    • Well, between AWS US EAST 1 killing half the internet, and this incident, not even a month passed. Meanwhile, my physical servers don't care and happily serve many people at a cheaper cost than any cloud offer.

      7 replies →

I'm really surprised by the sheer scale of how many websites this outage is affecting. We really need to decentralize all of these monolith clouds.

Just yesterday Cloudflare announced it was acquiring Replicate (AI to "help" its Workers), I believe.

God, my favourite website pornhub.com is also down. Why on earth, Cloudflare? I just now came from school.

Couldn't work. Fuckin' Cloudflare. Feels like 25% of the Internet is down.

I'm going home. Time for a beer.

Greetings from Germany

The Garmin site isn't working, for example, and they had removed the export option from the mobile application anyway.

This is reason 1, 2 and 3 on my "Top 3 Reasons to not Put All Eggs in One Basket" list.

Is there any way to get past challenges.cloudflare.com with tokens or something?

So stupid that there is no fallback and it can take down 50% of the internet.

Adding: looks like even Cloudflare's Silk Privacy Pass with challenge tokens is broken.

Such a great idea to put half the web behind a single point of failure without failover.

claude.ai is down too... lots of programmers are gonna have to pretend they can code some other way...

Telnyx seems to be down for me. Actually, I lied; I think it is working. At least a call connected.

Feels like 25% of the Internet is down just because of fuckin' Cloudflare.

I'm leaving the office because I can't work at the moment...

Time for a beer, greetings from Germany!

Now I can switch everything off and go home. We are not using CF at our site, but a CF error is a good reason to take a day off.

Bluesky still chugging along.

Just saying.

  • They are decentralized with servers all on the East coast that they self host. They do have points of failure that can take down the whole network, however.

Half of the internet is down. That's what you get for handing control of a service that was supposed to be decentralized to one company. Good; maybe if it costs companies a few billion they will stop putting all their eggs in one basket.

Aw man, how dare this affect me personally? :P (Tried to get to openstreetmap.org which is behind cloudflare.)

I'm weary of the broader internet having SPOFs like AWS and Cloudflare. You can't change routing or DNS horizons to get around it. Things are just broken in ways that are not only opaque but destructive, because so much relies on fragile synced state.

Will my Spelling Bee QBABM count today, or will it fail and tomorrow I find out that last MA(4) didn't register, ruining my streak? Society cannot function like this! /s

Gemini is up, I asked it to explain what's going on in cave man speak:

YOU: Ask cave-chief for fire.

CAVE-CHIEF (Cloudflare): Big strong rock wall around many other cave fires (other websites). Good, fast wall!

MANY CAVE-PEOPLE: Shout at rock wall to get fire.

ROCK WALL: Suddenly… CRACK! Wall forgets which cave has which fire! Too many shouts!

RESULT:

    Your Shout: Rock wall does not hear you, or sends you to wrong cave.

    Other Caves (like X, big games): Fire is there, but wall is broken. Cannot get to fire.

    ME (Gemini): My cave has my own wall! Not rock wall chief! So my fire is still burning! Good!

BIG PROBLEM: Big strong wall broke. Nobody gets fire fast. Wall chief must fix strong rock fast!

AWS, Azure, now Cloudflare, all within a month, are hit with configuration errors that are definitely neither signs of more surveillance gear being added by government agencies nor attacks by hostile powers. It's a shame that these fine services that everyone apparently needs and that worked so well for so long without a problem suddenly all have problems at the same time.

  • Most or all of these lost significant institutional knowledge through layoff after layoff and jobs moved to lower cost countries.

    Maybe a coincidence or maybe not.

  • AWS was not a configuration error, it was a race condition on their load balancer's automated DNS record attribution that caused empty DNS records. As that issue was being fixed, it cascaded into further, more complex issues overloading EC2 instance provisioning.