Comment by diggan

1 year ago

Why is the title of the post-mortem "GitHub Outage"? It makes it sound like Lovable somehow brought down GitHub, when in reality it seems like they were rate-limited by GitHub for creating lots of repositories, then got their GitHub App completely blocked for breaching the Terms of Service.

> Incident report for the GitHub outage on January 2-3, 2025

Writing it like that looks like you're pushing the blame on your downtime/outage to GitHub, like they're responsible for your application to be up, instead of taking full responsibility for it.

> Writing it like that looks like you're pushing the blame on your downtime/outage to GitHub, like they're responsible for your application to be up, instead of taking full responsibility for it.

Well, they did what they were supposed to: they explicitly asked GitHub what they were up to, GitHub gave an explicit "we're ok with this, go ahead", and once GitHub sees that, whoops, it's causing errors, they don't even bother to check if there are support tickets open with the customer, they just go and disable their access.

  • But that's true for any 3rd party you'd depend on. Everything will work until it doesn't. Doesn't mean you're less responsible for your project being down.

    A title like "GitHub caused our outage" would still make it clear the downtime wasn't the direct action of anyone on the team, yet still take responsibility for the fact that it happened. Instead, labeling it "Incident report for the GitHub outage" just seems like straight up blaming someone else.

    • Well, GitHub explicitly took responsibility. The first thing GitHub did once Lovable reached out for support was "reinstate our app and apologize for the issues it caused us and our users."

      And no, you are not responsible for every 3rd party service you use. Some services are unavoidable, some services are just nice-to-have, but if you can't trust a service to perform its advertised function, it is the service's fault.

  • If I read it correctly, it was a support person that provided them with assurance. Not an executive or vice president or manager or VP of sales. GitHub gave them neither permission nor approval; it was a single person in the support department.

    Who relies on support people to determine the basis of their business when it’s obvious that they were concerned with the high usage rate and that it might cause problems for their customers?

    • Oh yeah, support staff aren't always aware of everything. For example, with one of our latest features, we didn’t have time to add a UI option to disable the feature. The expectation was that support staff could disable it via their special admin panel upon user request. However, I accidentally discovered that when users asked to hide the feature, tech support told them it wasn’t possible! It turned out the tech support lead forgot to share that information with the team.

      As for the OP, they should have conducted load testing and implemented rate limits on their end, rather than blindly relying on someone’s word that GitHub was ready to handle all their product's load for free.

  • sounds like a pretty sane thing to do: GitHub protects the majority of their customers from instability caused by a few.

    unless you're a super important partner, the people on call might never have heard of your little app and just decide that it's the only safe thing to do to protect the reliability of the system.