Comment by webrobots

6 years ago

Dear customer,

This mail is a follow-up to the previous email we sent (on January 8th, 2020) on this topic. As a reminder, yesterday, we experienced an incident on a storage unit at our LU-BI1 datacenter, located in Luxembourg.

Despite the replication systems in place, and the combined efforts of our technical teams throughout the night, we were unable to recover the data that was lost on the impacted storage unit.

We sincerely apologize for the inconvenience that this situation has caused. This type of incident is extremely rare in the web hosting industry.

In the event that you have a backup of your data, we suggest that you use it to recreate your server at a different datacenter.

To help you in this, we have provided you with a promo code that will give you one free month for an instance, so that you can create a new Simple Hosting instance in a different datacenter:

    XXX

Wow, for a company that boasts "no bullshit", only offering a month after destroying data and backups seems a little tone-deaf.

Edit: in fairness, I'm not sure how exactly you would quantify such a loss anyway...

  • It sounds like they didn’t have any backups at all but rather relied on an active-active replication link to a secondary storage.

    Edit: who knows, it may be related to the HPE issue.

    https://www.bleepingcomputer.com/news/hardware/hp-warns-that...

    • What baffles me is that there seems to be no way for either the customer or a data-recovery company to flash a new firmware onto the drive after it has failed. Someone there wanted to spare the few millicents of copper trace for a JTAG port?!

    • Hmm... I wonder what the "incident" was. If it involved something akin to an "rm -rf," then of course their replication link didn't protect them.

    • Perhaps they were depending on snapshotting and were not prepared for some kind of hardware failure taking out the entire storage system.

  • Reputable hosting providers typically don't try to quantify such a loss, but rather outright offer a credit/compensation that is very obviously generous (say, a year or even two of free service).

    Especially when only a small part of your customer base is affected, it won't cost you that much, and "overcompensating" like that means that virtually no one is going to criticize you for quantifying it wrong; instead, the public narrative will be centered around "well, shit happens, they did their best and generously compensated".

I could understand the incident (though I would _at least_ start questioning the quality of the service I'm paying for), but IMHO this is not something that can be addressed with a casual e-mail that contains a few lines of excuses and a "promo code" like it's everyday business. That's astonishing.

The only thing worse than a bad incident is bad management of its aftermath.

> This type of incident is extremely rare in the web hosting industry.

Why would they include that sentence? Are they trying to imply it is rare for them because it is rare for the industry? Are they saying they are not as good as the industry, so customers should move to other providers? Or are they trying to show they apply the same inattention to their customer communication as they apply to their data backup/recovery practices?

This kind of data loss should simply never happen. It’s one thing to say “it will take us up to 30 days to restore your data because our fast recovery options aren’t working and we have to bring up cold archives”, it’s entirely another to say “your data is gone, tough”.

  • I'm not sure why you've been downvoted for this. I thought the same.

    I read it as: "This type of incident is extremely rare in the web hosting industry, because apparently the overwhelming majority of our competitors aren't capable of fucking up as badly as we just did."

    Doesn't inspire confidence at all, IMO.

  • > Why would they include that sentence?

    They're a French company; it may be a non-native speaker not catching the implication.

    It's also possibly an editing error, e.g. they started writing something like, "these types of incidents are extremely rare and when they happen etc" and most of it was dropped without considering how that changed the implication.

  • I think they're referring to the "incident" that they experienced (on the storage unit in the datacenter), not the situation as a whole. The implication is meant to be that they prepared for many things, but not something as unlikely as this.

  • I think it was meant to say "nobody is infallible", these events are extremely rare, but they /will/ occur, even if you're a customer of the best and biggest players.

A promo code in exchange for your data loss. What a bargain!

  • “Please keep trusting us to host your data”

    • "...marginally more than rolling your own or another cloud provider."

      And to "trust marginally more" simply means:

          gandi_cost_per_month + P(gandi_fails_per_month) * cost_recovery
          < 
          alt_cost_per_month + P(alt_fails_per_month) * cost_recovery
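
The inequality above can be sketched in a few lines of Python. This is only an illustration: the function name and every number below are made-up placeholders, not real Gandi prices or failure rates, and it assumes the recovery cost is the same with either provider, as the comment does.

```python
def expected_monthly_cost(price_per_month: float,
                          p_fail_per_month: float,
                          cost_recovery: float) -> float:
    """Monthly price plus the probability-weighted cost of recovering from a failure."""
    return price_per_month + p_fail_per_month * cost_recovery


# Hypothetical inputs, chosen only to show how the comparison works.
cost_recovery = 5000.0

gandi = expected_monthly_cost(price_per_month=10.0,
                              p_fail_per_month=0.001,
                              cost_recovery=cost_recovery)
alt = expected_monthly_cost(price_per_month=12.0,
                            p_fail_per_month=0.0001,
                            cost_recovery=cost_recovery)

# "Trust marginally more" holds only if the left-hand side is smaller.
prefer_gandi = gandi < alt
print(f"gandi={gandi:.2f} alt={alt:.2f} prefer_gandi={prefer_gandi}")
```

With these placeholder values the cheaper sticker price loses once the failure term is added, which is the point of the comparison: a small difference in monthly failure probability can outweigh a small difference in price when recovery is expensive.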

> This type of incident is extremely rare in the web hosting industry.

I read this as "so maybe you should consider one of the other web hosting companies that doesn't have problems like this."

Interesting. The public status page says they’re still waiting for the recovery process to complete.

Is this a response from the company or are you putting it forth as an example response for how to handle this incident better? It’s unclear from your post.