← Back to context

Comment by celestialcheese

17 hours ago

Azure migration is the most plausible explanation I've heard. https://news.ycombinator.com/item?id=45517173

GitHub claims that AI development tools have caused a massive surge in demand in recent months. They need to scale by 30X to keep up with demand.

According to GitHub, Azure migration is the attempt at a fix/upscaling, not the underlying cause of the issues.

Addressing GitHub’s recent availability issues: https://github.blog/news-insights/company-news/addressing-gi...

An update on GitHub availability: https://github.blog/news-insights/company-news/an-update-on-...

  • Github is claiming that a usage spike in 2026 is the cause of availability issues in 2025, so their explanation is clearly incomplete at best. The usage spike may be why things have failed to get better despite them putting effort into improving things, but it isn't the root cause of problems.

  • But the outages have been getting worse and worse even before anything related to AI took off.

    The issue is that they're not a scrappy startup anymore, they are defacto running the internets development infrastructure and are owned by a trillion dollar company.

    So the bar they're measured by has changed and they haven't even tried to keep up, paying lip service to reliability when you are critical infrastructure is not going to go well.

    There were reliability issues in 2010 for sure, but it feels worse now; the period before acquisition was the most stable (2014-2017).

  • > GitHub claims that AI development tools have caused a massive surge in demand in recent months. They need to scale by 30X to keep up with demand

    They said they're designing for a future that would require 30X of today's scale.

    They did not say that they need to scale 30X to meet today's demand.

    To be fair, the "demand is up 30X" claim was spammed all over social media so it's easy to see why this topic is so misunderstood

  • Funny how windows updates are never postponed for lack of "scaling". I know, I know, completely different stuff here - but arent test vms and ci vms being updated constantly?

    Im old enough to remember the hotmail migration to win2k (then 2k3) and the postmortem. I was also old enough to look at the rotor source code. Yah, that one, running managed code in freebsd.

  • Their own greed is causing their issues. They could be doing a million different things to reduce demand, but they don't want to dampen their current growth and have opted to continue scaling up at the cost of quality.

  • Then they shouldn't be encouraging AI development tool usage.

    I've never pushed a commit and thought huh, I wonder what copilot thinks of this.

Azure also regularly has incidents due to capacity issues in several regions, so that many Azure-managed services also go down. Some of those incidents have been open continuously for many months now.

OMG I was a few FAANG-like companies. Most acquisitions failed due to the migration. It always played in the same way:

1. The acquired company was small company to the acquirer. 2. We need to improve scalability and reduce cost!

Then, they migrated. The new system was worse and didn't have parity. It was years. Customers were moving off. The project/product shut down.

I glanced at zhe thread you linked. And as I understand they are in the process of migrating, which will take more than a year still.

If that’s the case, then it’s not necessarily a problem with Azure itself.