Comment by farhadhf

7 hours ago

Pretty much everything is down (checking from the Netherlands). The Cloudflare dashboard itself is experiencing an outage as well.

Not-so-funny thing is that the Betterstack dashboard is down but our status page hosted by Betterstack is up, and we can't access the dashboard to create an incident and let our customers know what's going on.

Edit: wording.

huijzer 6 hours ago

Yep that's also my experience. Except HN because it does not use *** Cloudflare because it knows it is not necessary. I just wrote a blog titled "Do Not Put Your Site Behind Cloudflare if You Don't Need To" [1].

[1]: https://huijzer.xyz/posts/123/

firecall 5 hours ago
Sadly, AI bots and crawlers have made CF the only affordable way to actually keep my sites up without incurring excessive image serving costs.
Those TikTok AI crawlers were destroying some of my sites.
Millions of images served to ByteSpider bots, over and over again. They wouldn't stop. It was relentless abuse. :-(
Now I've just blocked them all with CF.
- flakeoil 5 hours ago
  
  > Now I've just blocked them all with CF.
  Yeah, they for sure let nothing through right now. ;)
  
- zenmac 5 hours ago
  
  Wouldn't it be trivial to just write a ufw rule to block the crawler IPs?
  At times like this I'm really glad we self-host.
  
- unethical_ban 3 hours ago
  
  I don't understand. What exactly are they doing, what are their goals? I'm not trying to argue, I genuinely don't get it.
  edit: I guess I understand "AI bots scraping sites for data to feed LLM training" but what about the image serving?
- Aeolun 5 hours ago
  
  > Now I've just blocked them all with CF.
  You realize it was possible to block bad actors before Cloudflare right? They just made it easier, not possible in the first place.
  
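For readers wondering what blocking such crawlers without a CDN might look like, here is a minimal sketch, not anyone's actual setup from this thread. It assumes the site runs as a Python WSGI app and that the crawler identifies itself with a User-Agent containing "Bytespider"; the names `BLOCKED_AGENT_SUBSTRINGS`, `is_blocked`, and `block_crawlers` are made up for illustration. Crawlers that spoof their User-Agent or rotate IPs would get past it, which is part of why people reach for CF.

```python
from typing import Callable, Iterable

# Illustrative list (an assumption): substrings of User-Agent headers to refuse.
BLOCKED_AGENT_SUBSTRINGS = ("bytespider",)


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent contains any blocked substring."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_SUBSTRINGS)


def block_crawlers(app: Callable) -> Callable:
    """Wrap a WSGI app so blocked crawlers get a 403 before the app runs."""

    def middleware(environ: dict, start_response: Callable) -> Iterable[bytes]:
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Rejecting the request before the app runs means no image is read or served for those requests, though the server still pays for accepting the connection.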
MinimalAction 6 hours ago
Yes, I've never understood this obsession with centralized services like Cloudflare. To be fair though, if our tiny blogs only get a hundred or so visitors a month anyway, does it matter if they have an outage for a day?
- ThunderSizzle 5 hours ago
  
  I think not having to worry about certs is partly a nice reason to hide behind the proxy. Also, to help hide your IP address, I guess.
  Of course, on the other hand, I know that relying on Cloudflare's certs is basically inviting a MITM attack.
  
ramon156 5 hours ago

Last time I tried this I got DDoS'd so I don't see a reason to step away from CF. That said, this is the price I pay
Illniyar 6 hours ago
Does HN not experience DDOS? I would imagine being as popular as it is it'll experience DDOS.
- q3k 5 hours ago
  
  It does: https://m5hosting.status.io/pages/incident/5407b8e2b00244251...
  But turns out that's fine :).
  
zzzeek 5 hours ago
~~two~~ three comments on that:
1. DDOS protection is not the only thing anymore, I use cloudflare because of vast amounts of AI bots from thousands of ASNs around the world crawling my CI servers (bloated Java VMs on very undersized hosts) and bringing them down (granted, I threw cloudflare onto my static sites as well which was not really necessary, I just liked their analytics UX)
2. the XKCD comic is misinterpreted there, that little block is small because it's a "small open source project run by one person", cloudflare is the opposite of that
3. edit: also cloudflare is awesome if you are migrating hosts, did a migration this past month, you point cloudflare to the new servers and it's instant DNS propagation (since you didn't propagate anything :) )
- dboreham 4 hours ago
  
  Why are your CI servers open to the public network?
  

pell 7 hours ago

It’s that time of the year again where we all realize that relying on AWS and Cloudflare to this degree is pretty dangerous but then again it’s difficult to switch at this point.

If there is a slight positive note to all this, then it is that these outages are so large that customers usually seem to be quite understanding.

isodev 7 hours ago
Unless you’re, say, at the airport trying to file a luggage claim … or at the pharmacy trying to get your prescription. I think as a community we have a responsibility to do better than this.
- ChrisMarshallNY 6 hours ago
  
  > I think as a community we have a responsibility to do better than this.
  I have always felt so, but my opinion is definitely in the minority.
  In fact, I find that folks have extremely negative responses to any discussion of improving software Quality.
  
- sigilis 6 hours ago
  
  You aren’t cloudflare’s customer in these examples. It depends on the companies that are actually paying for and using the service to complain. Odds are that they won’t care on your behalf due to how our society is structured.
  Not really sure how our community is supposed to deal with this.
  
dlisboa 6 hours ago

> If there is a slight positive note to all this, then it is that these outages are so large that customers usually seem to be quite understanding.
Which only shows that chasing five 9s is worthless for almost all web products. The idea is that by relying on AWS or Cloudflare you can push your uptime numbers up to that standard, but these companies themselves are having such frequent outages that customers don't expect that kind of reliability from web products.
tommica 7 hours ago

> It’s that time of the year again
It's monthly by now
lbreakjai 7 hours ago
If I choose AWS/cloudflare and we're down with half of the internet, then I don't even need to explain it to my boss' bosses, because there will be an article in the mainstream media.
If I choose something else, we're down, and our competitors aren't, then my overlords will start asking a lot of questions.
- stevepotter 6 hours ago
  
  Yup. AWS went down at a previous job and everyone basically took the day off and the company collectively chuckled. Cloudflare is interesting because most execs don’t know about it so I’d imagine they’d be less forgiving. “So what does cloudflare do for us exactly? Don’t we already have aws?”
- jfengel 6 hours ago
  
  And if everyone else is down, and you are not, you will get no credit.
  
- timeon 6 hours ago
  
  In reality it is not half of the internet. That is just marketing. I've personally noticed one news site while others were working. And I guess sites like that will get the blame.
fusl 7 hours ago
Happy to hear anyone's suggestions about where else to go or what else to do to protect against large-scale volumetric DDoS attacks. Pretty much every CDN provider nowadays has stacked up enough capacity to tank these kinds of attacks; good luck trying to combat them yourself these days.
- trollbridge 6 hours ago
  
  Somehow KiwiFarms figured it out with their own "KiwiFlare" DDOS mitigation. Unfortunately, all of the other Cloudflare-like services seem exceptionally shady, will be less reliable than Cloudflare, and probably share data with foreign intelligence services I have even less trust for than the ones Cloudflare possibly shares them with.
- isodev 7 hours ago
  
  Anubis and/or Bunny are good alternatives/combination depending on your exact needs
  - https://anubis.techaro.lol/
  - https://bunny.net/
  
- bandrami 6 hours ago
  
  Is a DDOS more frequent and/or worse than stochastic CDN outages?
- q3k 6 hours ago
  
  Just accept that a DDoS might happen and that there's nothing you can do about it. It's fine, it's just how the Internet works.
  
weird-eye-issue 7 hours ago
Oh no, we had 30 minutes of downtime this year :(
- CableNinja 6 hours ago
  
  Five 9s is about 5 minutes a year. They are breaking SLAs and impacting services people depend on.
  Tbh though this is sort of all the other companies' fault, "everyone" uses aws and cf and so others follow. Now not only are all your eggs in one basket, so are everyone else's. When the basket inevitably falls into a lake....
  Providers need to be more aware of their global impact in outages, and customers need to be more diverse in their spread.
  
- pell 7 hours ago
  
  I do think this is tenable as long as these services are reliable. Even though there have been some outages I would argue that they’re incredibly reliable at this point. If this ever changes, though, moving to a competitor won’t be as simple as pushing a repository elsewhere, especially for AWS. I think that’s where some of the potential danger lies.
  
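For reference, the downtime budget behind an "N nines" target is simple arithmetic: the allowed unavailability is 10^-N of the year. A small sketch of that calculation follows; the function name `downtime_minutes_per_year` and the choice of a 365.25-day year are assumptions made for illustration.

```python
# Minutes in an average year (365.25 days), an assumption for this sketch.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines: int) -> float:
    """Return the yearly downtime allowed by an availability of N nines."""
    unavailability = 10 ** (-nines)  # e.g. 5 nines -> 0.00001 (99.999% up)
    return MINUTES_PER_YEAR * unavailability
```

With these assumptions, five nines works out to a little over 5 minutes a year and three nines to a bit under 9 hours, so a single 30-minute outage already rules out four nines (about 53 minutes a year leaves room for it, five nines does not).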

postalcoder 7 hours ago

Cloudflare dashboard is down-ish, not totally down. If you're persistent you can turn off the turnstile and proxy.

It took a few minutes but I got https://hcker.news off of it.

trollbridge 6 hours ago
I can't sign in since Turnstile is down so I can't complete the captcha to log in.
I also can't log in via Google SSO since Cloudflare's SSO service is down.
farhadhf 6 hours ago
I'm already logged in on the cloudflare dashboard and trying to disable the CF proxy, but getting "404 | Either this page does not exist, or you do not have permission to access it" when trying to access the DNS configuration page.
skywhopper 6 hours ago

Not saying not to do this to get through, but just as an observation, it’s also the sort of thing that can make these issues a nightmare to remediate, since the outage can actually draw more traffic just as things are warming up, from customers desperate to get through.
But then, that’s what Cloudflare signed up to be.

celltalk 6 hours ago

I think there is a big business opportunity here. Make a site that lets companies put their status updates on a local VPS for $100.

alt227 6 hours ago
Atlassian has this business model sewn up
https://www.atlassian.com/software/statuspage
- lc64 5 hours ago
  
  It's worth noting that cloudflare's status page is hosted there. Pretty good proof that it works
  
codethief 6 hours ago
Maybe that's precisely what Cloudflare did and now their status page is down because it's receiving an unusual amount of traffic that the VPS can't handle.
- celltalk 6 hours ago
  
  They should have had Cloudflare on it.
colinbartlett 6 hours ago

Even the Cloudflare status page, hosted by Atlassian Statuspage, is suffering. Probably due to the traffic crush.
nrhrjrjrjtntbt 6 hours ago

Status pigeons.
ramon156 6 hours ago

on-demand status balancing!

compumike 5 hours ago

Could always just use a status page that updates itself. For my side project Total Real Returns [1], if you scroll down and look at the page footer, I have a live status/uptime widget [2] (just an <img> tag, no JS) which links to an externally-hosted status page [3]. Obviously not critical for a side project, but kind of neat, and was fun to build. :)

[1] https://totalrealreturns.com/

[2] https://status.heyoncall.com/svg/uptime/zCFGfCmjJN6XBX0pACYY...

[3] https://status.heyoncall.com/o/zCFGfCmjJN6XBX0pACYY

jcfrei 4 hours ago
This is unrelated to the cloudflare incident but thanks a lot for making that page. I keep checking it from time to time and it's basically the main data source for my long term investing.
- compumike 4 hours ago
  
  I appreciate that, thank you! :)

biinjo 7 hours ago

Same here. We’re using OhDear. The status page is available but I can’t post an incident because their service is also behind Cloudflare.

Mojah 7 hours ago

Co-founder here, we'll be working on better ways to handle this over the coming days.
Update: our app is available again without Cloudflare, you'll be able to post updates to status pages smoothly again.

davedx 6 hours ago

All my stuff is working. Things on GCP. Things on Fly.io. Tooling I use.

"Only" 10% of the internet is behind Cloudflare so far ;)

grabcadder 5 hours ago
Happy for you :)
I am curious about these two things:
1- Has GCP also had any outages recently similar to AWS, Azure or CF? If a similar size (14 TB?) DDoS were to hit GCP, would it stand or would it fail?
2- If this DDoS was targeting Fly.io, would it stand? :)
- davedx 5 hours ago
  
  I actually spoke too soon, and accept I have egg on my face!
  Apparently prisma's `npm exec prisma generate` command tries to download "engine binaries" from https://binaries.prisma.sh, which is behind... guess what...
  So now my CI/CD is broken, while my production env is down, and I can't fix it.
  Amazing lol
- progbits 5 hours ago
  
  For GCP network that would be a rounding error. Of course GCP sometimes has outages too, all providers do.

talonx 4 hours ago

BetterStack did report issues with some of their services, but they were not very informative.

esskay 7 hours ago

When it's back up, do yourself a favour and rent a $5/mo VPS in another country from a provider like OVH or Hetzner and stick your status page on that.

"Yes but what if they go down" - it doesn't matter, having it hosted by someone who can be down for the same reason as your main product/service is a recipe for disaster.

fodi 6 hours ago

Definitely. Tangentially, I encountered 504 Gateway Timeout errors on cloudflarestatus.com about an hour ago. The error page also disclosed the fact that it's powered by CloudFront (Amazon's CDN).
jwr 6 hours ago

Or use a service like https://updown.io/ (I host my status page there).
hcaz 7 hours ago
https://cachethq.io/ is great for this
- jwr 6 hours ago
  
  Amusingly enough, it is down right now because of Cloudflare :-)
- fusl 7 hours ago
  
  Been using Cachet for quite a while before inevitably migrating to Atlassian's Statuspage.io. I'm a huge fan of self-hosting and self-managing every single thing in existence but Cachet was just such a PITA to maintain and there was just no other good alternative to Cachet that was also open source.
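As a rough illustration of the "cheap VPS with a static status page" idea discussed above, here is a minimal sketch using only the Python standard library. It is not how Cachet, Statuspage, or any service mentioned here works; the function names `probe`, `render_status_page`, and `main`, the output file `status.html`, and the example.com URLs are all hypothetical.

```python
import html
import urllib.error
import urllib.request
from datetime import datetime, timezone


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, TimeoutError, OSError):
        # HTTP errors, DNS failures, refused connections and timeouts all count as down.
        return False


def render_status_page(results: dict[str, bool], checked_at: datetime) -> str:
    """Render probe results as a small static HTML page."""
    rows = "\n".join(
        f"<li>{html.escape(name)}: {'up' if ok else 'DOWN'}</li>"
        for name, ok in results.items()
    )
    return (
        "<!doctype html>\n<title>Status</title>\n"
        f"<p>Last checked: {checked_at.isoformat()}</p>\n"
        f"<ul>\n{rows}\n</ul>\n"
    )


def main() -> None:
    # Hypothetical endpoints; replace with your own services.
    targets = {"website": "https://example.com/", "api": "https://example.com/health"}
    results = {name: probe(url) for name, url in targets.items()}
    with open("status.html", "w", encoding="utf-8") as f:
        f.write(render_status_page(results, datetime.now(timezone.utc)))
```

Running `main()` from cron on the separate VPS and serving `status.html` with any plain web server keeps the page independent of the provider in front of the main product, which is the point being made in this subthread.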

touristtam 4 hours ago

Thankfully the usual social media are still up ... oh wait https://www.bbc.co.uk/news/articles/c629pny4gl7o

ablation 7 hours ago

This is a big one.

csomar 7 hours ago

Seems like workers are less affected and maybe betterstack has decided to bypass cloudflare "stuff" for the status pages? (maybe to cut down costs). My site is still up though some GitHub runners did show it failed at certain points.

tyingq 7 hours ago
I have a workers + kv app that seems fine right now.
- csomar 7 hours ago
  
  Pretty sure they went down for a while because I have 4xx errors they returned, but apparently it was short-lived. I wonder if their workers infra failed for a moment and that led to a total collapse of all of their products?

chrisandchris 6 hours ago

I don't get why you need such a service for a status page with 99.whatever% uptime. I mean, your status page only has to be up if everything else is down, so maybe 1% uptime is fine.