Comment by rco8786

3 months ago

Is it me or has there been a very noticeable uptick in large-scale infra-level outages lately? AWS, Cloudflare, etc. have all been way under whatever SLA they publish.

trollbridge 3 months ago

Coincidentally, large tech companies have been conducting mass layoffs and claim they're going to rely on AI much more to replace junior developers.

MaxHoppersGhost 3 months ago

And they are offshoring roles to lower quality devs.
6c696e7578 3 months ago
Interestingly, chatgpt was unavailable due to the same cloudflare outage.
- creatonez 3 months ago
  
  Imagine vibe coding something in production, it breaks half the internet, then you can't vibe code it back because it broke the LLM providers. A real catch-22 for the modern age!
xnx 3 months ago
By similar thinking, you could blame large tech companies if they hired too many juniors.
- teeray 3 months ago
  
  Juniors, at least, have the capacity to learn.
Aurornis 3 months ago
That does seem to be a coincidence, as the recent outages making headlines (including this one, according to early reports) have been associated with huge traffic spikes. It seems DDoS attacks are reaching a new level.
- rco8786 3 months ago
  
  AWS's most recent blow-up was not a DDoS
- darknavi 3 months ago
  
  Maybe a laid-off engineer is bored and started orchestrating DDoS campaigns in their newly-found free time.
gtsop 3 months ago

[dead]

alt227 3 months ago

For me the only silver lining to all these cloud outages is that now we know their published SLA times mean absolutely nothing. The number of 9's used to at least give an indication of intended reliability; now they are twisted to whatever metric the company wants to present and don't actually represent guaranteed uptime anywhere.

bojangleslover 3 months ago
So true. AWS for example gives only platform credits in the event of an outage. Basically no recourse or insurance.
- op00to 3 months ago
  
  Doesn’t everyone do that? I’ve never worked for a place where the base policy wasn’t credits. You might have special contract language stating otherwise, but for almost everyone, it’s credits.
  

AsmaraHolding 3 months ago

Some of the other commenters here have posited a "vibe code theory". As the amount of vibe code in production increases, so does the number of bugs and, therefore, the number of outages.

ACCount37 3 months ago
None of the recent major outages were traced to "vibe coding" or anything of the sort. They appear to be the kind of misconfigurations and networking fuckups that have existed since the Internet became more complex than 3 routers.
- Seb-C 3 months ago
  
  The "vibe thinking" trend, where people stop using their brains and rely on whatever random output the LLM gives them, is harder to diagnose, but it's certainly there and at least as bad as vibe coding.
  
- ceejayoz 3 months ago
  
  How likely are we to know when a "misconfiguration or networking fuckup" is due to someone asking ChatGPT how to do the task?
- shufflerofrocks 3 months ago
  
  >misconfigurations and networking fuckups that existed since Internet became more complex than 3 routers.
  Yet there has been an uptick in frequency of outages only in the recent few months. Correlation correlation.
  Why assume that these misconfigs are not the result of someone asking AI how to do them?
  
- davey48016 3 months ago
  
  Wasn't the recent AWS outage caused by a race condition that's existed since before vibe coding was a thing?
fransje26 3 months ago
Speaking of "vibe-coding", I wonder how much their own outage is affecting their ability to vibe-code their way out of it.. :-)
The openai login page says:
Please unblock challenges.cloudflare.com to proceed.
swed420 3 months ago
> Some of the other commenters here have posited a "vibe code theory". As the amount of vibe code in production increases, so does the number of bugs and, therefore, the number of outages.
Likely this coupled with the mass brain damage caused by never-ending COVID re-infections.
Since vaccines don't prevent transmission, and each re-infection increases the chances of long COVID complications, the only real protection right now is wearing a proper respirator everywhere you go, and basically nobody is doing that anymore.
- DaSHacka 3 months ago
  
  Are you being hyperbolic? It's clearly not this, and very likely not GP's proposal either.
  
- fabioborellini 3 months ago
  
  I have become dumber without having contracted covid or other respiratory diseases (which could have been covid). The 2020s have been the era of fascism, war and communities getting torn apart, which does not really help with stress levels and intellectual performance.

roxolotl 3 months ago

The theory I’ve heard is that holiday deploy freezes coupled with Q4 goals create pressure to get things in quickly and early. It’s all been in the last month or so, which does line up.

rco8786 3 months ago

What's different about this Q4 vs the last 20 years of Q4s?
grobins2 3 months ago

The obvious answer is to cancel holidays.

tristanperry 3 months ago

My theory is a state-sponsored actor targeting some of these services, but maybe that's just too 'tinfoil hat' of me, who knows.

wepple 3 months ago

There are usually very comprehensive post mortems for these events, and none have suggested that at all
bflesch 3 months ago

This only amplifies the often-repeated propaganda about the "very powerful" enemies of democracy, who in fact are very fragile dictatorships. There's enough incompetence at tech companies to f up their own stuff.
rozap 3 months ago

My theory is DNS.

whalesalad 3 months ago

Somewhere, at a floating desk behind a wall of lava lamps, in a nyancatified ghostty terminal with 32 different shader plugins installed:

You're absolutely right! I shouldn't have force pushed that change to master. Let me try and roll it back. *Confrobulating* Oh no! Cloudflare appears to be down and I cannot revert the change. Why don't you go make a cup of coffee until that comes back. This code is production ready, it's probably just a blip.

kqr 3 months ago

If it's any guidance, US cyber risk insurance (which covers, among other things, disruptions due to supplier outages) has continuously dropped in price since Q1 2023, by a few percent per year.

If you'll excuse the sloppy plot, manually transcribed from market index data: https://i.xkqr.org/cyberinsurancecost.png

codethief 3 months ago

Don't forget Azure Front Door / half of Azure.