Gandi loses data, customers told to use their own backups

6 years ago (status.gandi.net)

Whoops, so long with the "no bullshit" policy.

I stopped using them a while ago but for a different reason. I used to use their website to check availability/whois for domains that I was interested in buying. If a domain was available I didn't buy it right away; I waited until I had finished the website/app or whatever I was going to put there, which obviously took me a few months. More than once, when I was finally ready, the domain had already been sold to someone else. This happened five times over six or so years. Now, I know, "someone else could have thought the same thing" but I find it very hard to believe that it happens so often. These domains were fairly niche words that were not hot topics at the time, some of them using fairly uncommon TLDs (like .one). Another weird thing is that they were always registered to someone living or doing business in India, and there was always a fairly simple landing page with a "contact me" link. I'm a bit superstitious so I don't think it was a coincidence.

Now, I don't think this is a GANDI problem per se, but my theory is that they share this information (who is looking for which domains) with marketers or something like that, or maybe it was a rogue employee trying to make some money squatting domains. I would have expected this from BigDaddy or similar sharks, but from a company whose motto is "no bullshit" I had hoped for much better. Anyway, I decided to move (to namecheap if you're wondering) and surprisingly the problem went away.

  • Domain registrars giving away domains to squatters when people search for them is a time honored practice. The advice I've seen is to not search for the domain first, just register it outright from the start. The domain registration business is run with all of the integrity and customer focus of TicketMaster.

    • Really? I may have just found myself a new hobby. Search for incredibly unique (and worthless to me) domains to see if I can get people to squat on them. Heck, it could be a game ... I could get all my friends to make bingo boards ... or maybe see if I can think of some scrabble like rules.

    • >The advice I've seen is to not search for the domain first, just register it outright from the start. The domain registration business is run with all of the integrity and customer focus of TicketMaster.

      What if you don't want to cough up $10-$20 on a whim? Would doing a whois (using the NIC's whois site) suffice?

    • On Namecheap they have a "save for later" button per domain in your cart, so you can just pile them up for later.

      Never had even 1 of my "saved for later" being squatted, for several years now, so I really have to trust and commend namecheap in that regard.

  • They never had a real "no bullshit" policy. When I had my domains with them, I had been asked to verify my identity 34 times in 12 months. 34 separate fucking times. Because "ICANN says so" or some stupid shit (their words, not mine). It stopped the moment I moved to Google Domains, where they asked once and never again.

    EDIT: And, to make things worse, each time I was threatened with the "confiscation" of my domain, and the round trip on the tickets was so high that each instance took 2-3 days to resolve. Frustrating as hell.

    • Since you're giving an anecdote, let me do the same. I have about 30-40 domains with Gandi, and have been using it for about ten years. I don't remember ever verifying my identity, but I guess I must have done it at least once. I have not been asked to verify anything for at least the last five years of using it.

      Disclaimer: I don't work there or have any relationship, except I'm a happy customer

    • On the other hand, I've started using them more for DNS because the one time I forgot my password (typo in password manager I think) they made it very difficult for me to reset it, asked for pieces of ID, phone number registered in my name etc...

      This was at the time of the stories of other registrars giving customers second and third chances to guess their PIN, or credit card or whatever mechanism they had, and resulting in domain hijacking.

    • >They never had a real "no bullshit" policy. When I had my domains with them, I had been asked to verify my identity 34 times in 12 months. 34 separate fucking times. Because "ICANN says so" or some stupid shit (their words, not mine).

      Can you elaborate on what the "verification" entails? There is an ICANN requirement[1] to validate whois information, although I've only been asked to validate email (at another registrar, not Gandi).

      [1] https://www.icann.org/resources/pages/approved-with-specs-20...

    • Jeez. Like most experiences I suppose, it's hit and miss. They've promptly resolved every problem I've had and I've bought plenty of domains through them.

      They're still my go-to provider.

    • Gandi was deploying the ID verification as a bullying tactic long before anything like that was mandated by ICANN. (Not that ID verification is even mandated now)

  • I've heard that this isn't actually the registrar's fault, it's the registry's fault. So your TLD is sharing the "is registered" query with other parties. That said, everything about the domain registration industry seems designed to appear sketchy as all hell, so who knows.

  • If you want an alternative to searching with a registrar you can always type:

      whois mysupergreatcoolappidea.com
    

    into a terminal window and see if you get back a result

    • WHOIS and DNS requests are made to nameservers run by the registry (not registrars), it's possible for the registry to front-run domains if they intend to.
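
For anyone curious what that `whois` command actually does: the protocol (RFC 3912) is just a TCP connection to port 43, the query followed by CRLF, and the server's text reply until it closes the connection. Below is a minimal Python sketch, assuming you pass in the right server for the TLD yourself (`whois_query` and its parameters are my own names, not part of any library). Note that the query still reaches whoever runs that server, which is the point made above about registries.

```python
import socket


def whois_query(domain: str, server: str, port: int = 43, timeout: float = 10.0) -> str:
    """Send one WHOIS query (RFC 3912 style) and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # The request is simply the query string terminated by CRLF.
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server signals end of reply by closing the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling something like `whois_query("mysupergreatcoolappidea.com", "whois.verisign-grs.com")` should return the raw record for a .com name; that hostname is what I believe Verisign uses for .com, so treat it as an assumption and check the IANA entry for other TLDs.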

  • This happened to me with GoDaddy and Namecheap before, which is why I switched to using Gandi for all my domain searches... Now I'm regretting it!

    But as @Jasper_ said, this could be a problem with the domain name registry selling/leaking that info (AKA all their 'is_available' queries), and not the registrar.

    • At one point, I believe a GoDaddy VP was doing this as a personal side business. For many reasons, GoDaddy is the shadiest of them all.

    • You shouldn't regret it. GoDaddy was and still is way worse than this, there is no comparison.

    • I was always told that Namecheap does not engage in this practice.

      It's a single data point, but I instructed a client to search for domains on Namecheap last year since they were undecided. I just didn't want them to use GoDaddy, and I warned them why. They settled on a domain but registered it months afterwards. It was still available.

  • It could be your local DNS resolution that is leaking to bad actors. It would be kind of stupid and self defeating for registrars to undercut their own customers. I would expect that some have done this in the past, but would be very surprised if it is done at Gandi with their knowledge... and undoubtedly French law would not smile kindly on such behavior.

  • Although the current issue with the irrecoverable data loss is terrible, I thought (in this case, at least) that they were surprisingly honest. They straight up said the data is gone (a VERY hard thing to publicly admit), and informed people they need to restore from their own backups. That seems pretty No Bullshit to me, no?

  • I have had a similar experience with other registrars.

    Edit: Sad to hear of the data loss and for anyone affected. Trusting cloud providers doesn't always work out either.

  • Reports of reputable registrars front-running are persistent, but unfounded. Anytime I’ve looked into it, I’ve never seen any evidence for it.

    If proven it would be a major blow to their business, so why would they try to snatch pennies from in front of a steam roller?

    So I call b.s. on any reports of “the registrar noticed me searching for a domain and registered it”.

    • Um, NetSol settled a $1MM class action suit over exactly this about a decade ago.

      It absolutely has happened and quite possibly still does.

  • Would `whois` be any better to prevent leakage of domains you intend to register?

Oof, this Twitter thread looks particularly bad, especially the response from the official Gandi account.

https://twitter.com/andreaganduglia/status/12151991477012316...

While I appreciate that there are real people behind these companies that are probably having a really rough time right now, the criticism that Gandi are getting as a company is justified - and if Gandi are truly a "no bullshit" company they need to put something out to their customers asap.

  • Screenshotted in case (when) they delete it https://i.imgur.com/s3R1VVc.png

    Using memes after permanently losing customer data is extremely disrespectful.

    • "Julie Pelloille @juliepelloille Replying to @gandi_net @andreaganduglia and 4 others

      This post was disrespectful. It's not an excuse, but this is a stressful situation and the thread was getting heated. Either way, I truly regret posting it and it was my decision alone to do so. Please don't take this as representative of the high standard Gandi sets"

      "That said, for the sake of transparency, we won't be deleting the tweet -- Julie"

    • ...not losing data is the ONE thing I expect companies to get right. I could handle downtime, circular customer support, high prices, horrible UX, and all that. But losing or corrupting data? Heck no.

      A company that loses customer data in production is the exact type I would expect to mock their customers using memes.

    • I don't blame the communications rep. From her perspective, she's probably been told what the CEO believes - Gandi lost data, but they never promised backups so it's not a big deal. They responded to someone that is being extremely critical. The rep (Julie) did the right thing and apologised after others criticised her tweet, and also kept the response up to illustrate the mistake. While a meme is bad taste, I can somewhat understand the reaction.

      IMO, the blame lies solely with the CEO, because he has yet to retract his statement that snapshots are not backups (despite their site selling them as backups to the end-user), and because he has not accepted that, for anyone controlling business data, creating backups AND regularly testing them via restores is 100% essential. Culture trickles down, and if the CEO only accepts blame and not the reason for the blame then it's a sign that they won't learn from the problem - and that's the biggest red flag you will ever see in ANY business.

      I can only see one way back for them that won't taint their reputation completely. They need to:

      * Post a full post-mortem of what happened, how it happened, how they fixed it, and what they're going to do to ensure it never happens again.

      * Issue a full apology for the problem. Accept full blame, and accept (including the CEO on Twitter) that Gandi failed to follow accepted industry standards.

      * Sit down with the engineers that work at Gandi and hear their grievances. While I doubt that their engineers knew this would happen, I'd be willing to bet that there is at least one person there that had raised the lack of off-site backups and no recovery mechanism. That person needs a promotion, and whatever resources needed to fix Gandi.

      * Issue a full refund to those that lost data - not a small discount, as already reported. A discount is a kick in the teeth, whereas a full refund is the start of a real apology for failing the customer. If you go for a meal at a restaurant and find broken glass in your food, the first thing the server will do is give you a full refund, no questions asked, regardless of how expensive your party's order was. Gandi need to take the hit, and live to fight another day.

  • This is god damn unbelievable

    "Andrea, sorry about that and the incident. If we led you to believe that you had nothing to do on your side when warned multiple times to make your back ups, then we'll have to make it clearer, and stop assuming that it's an industry wide knowledge."

    • That's such an obnoxiously passive-aggressive response from the CEO. Bit of a red flag for the company culture.

    • After another support person made a joke in response to his very serious post.

      This is one of the worst responses I’ve ever seen from a company, and I’m not being hyperbolic.

    • Seeing the thread I couldn't believe they are being serious. Feels like they are playing a tasteless prank. Such crass and careless attitude is downright repelling.

  • The number of people who have control of social media accounts for companies and who do not understand how to relate to people or basic customer service, or who cannot predict how their posts will be received, is shocking.

    I worked at a company of 5K+ people and one of the folks in control of the twitter account(s) would come to me with questions.

    Now I applauded them for coming to me for technical questions before posting, that was great, but they absolutely did not have the self awareness / understand what to say / when to say it and etc.

    But hey they were tied to a high ranking person (who also had no clue) so they had access to the account.

    In my early days I worked PC customer support... I feel like that comes in handy all the time.

  • Wow. I've never used Gandi but I have seen it recommended before as a low-cost option. I will actively encourage people to avoid it from now on. That's scary.

    • Gandi has never been a low cost option, they've always been on the high to extreme higher end of things for individual cost...

      Especially for random ccTLDs, they're often significantly more expensive than the alternatives.

      Random selections for domains: .ru is $1-3 most anywhere else, Gandi is $18.

    • They used to be very good if you wanted a non-scammy registrar with a huge selection of TLDs and ccTLDs. However, in recent years, success seems to have gone to their head and the service is nothing like it used to be (plus their latest control panel UX is an abomination).

      Feels like the CEO has made his money, forgotten the company's roots in the process and is happy for Gandi to be just another generic, overpriced registrar running on auto-pilot.

  • I don't get the criticism.

    If they lost all the data, then obviously the only option for customers is to either use their own backups if they have them or accept that the data is permanently lost.

    One can criticize their lack of additional redundancy, but don't see what's wrong with the response.

    • Sure, if the data is lost there isn't much that can be done to go back and fix it. However, the company response appears very dismissive/flippant which sends a bad message.

      The tone any company hosting customer data should take in the event of data loss is along the lines of 'regretfully... we screwed up... unfortunately... steps we are taking to ensure this doesn't happen again...' i.e. the company should either be humble and apologetic or they should expect to lose a large chunk of their customers after something like this. This isn't merely to say the right thing, it is to demonstrate that they acknowledge this was their issue and something they need to fix going forward rather than a 'sucks to be you' customer issue. This is basic customer relations / crisis management stuff.

    • The customers are being stupid and rude: assigning blame, asking redundant questions, making threats. Nothing in any of the twitter threads I've seen has any potential to solve any problems, they're yelling thinly veiled abuse at support.

      The industry standard is sucking up to them and groveling, and it's led to customers being very badly behaved.

      The trouble is no one has a good working alternative to the industry standard.

      Gandi certainly doesn't, they're not responding in a well thought out manner, they're losing their cool and getting angry with their customers. That's a quick way to go out of business.

    • I’m pretty repelled by their tone in that thread. Sweeping it under the rug (could’ve happened to anyone / shit happens) instead of just owning up to it. Throwing in that completely inappropriate meme. Contradicting their marketing material when it’s convenient (are snapshots backups?) and general passive aggressiveness.

Just to play devil's advocate: This is in no way different to how Azure, AWS, and GCP operate. They don't have backups either. They too rely on n-way replication, a bit like a distributed RAID.

All cloud providers make it absolutely clear, in black & white, that protection of your data is your responsibility, not theirs.

What I find hilarious is that most cloud providers only provide built-in backup functionality for a tiny subset of their services.

Ask Microsoft if they have a "backup" button for Azure DNS Zones. Or Azure load balancers. Or anything else that isn't a VM disk, App Service, SQL Database, or a Secrets Vault.

I mean, look at this insanity: https://docs.microsoft.com/en-us/azure/backup/backup-azure-f...

"Backup for Azure file shares is in Preview."

After 10 years of operation, this trillion-dollar company has only a use-at-your-own-risk beta for data protection!

Don't be too hasty to point fingers at Gandi and laugh about how they're unprofessional. Whatever you're using is essentially the same.

Ask yourself this: Could your organisation recover if some malicious admin simply deleted all Azure Resource Manager resources in one go using PowerShell?

  • Everything you say here is true, but at the same time it's just a fact that Gandi lost a lot of customers' data, and AWS, GCP, and Azure have never (as far as I know) lost a significant amount of it at once. You can talk about theoretical responsibility for data, and it's true, you are responsible for having backups of your data, no matter how many "9s" the service has, but the basic fact is that some services have been consistently good at not losing customer data, and others haven't. Even though I'm going to back up my data no matter where it is, I'd still rather use the service that's got a better track record with it.

    I haven't ever even lost a file on Google Drive, which as far as I know provides no reliability guarantees at all.

    • Back in the early days GMail lost customer data due to storage corruption. It has happened.

      The rarity is immaterial, the responsibility for data protection lies with you, not them.

  • That's kind of like saying there's no difference in safety between an airliner and the winged contraption that my idiot brother built in his garage.

    After all, they both have wings and will both kill you if they fall out of the sky, and I don't see Airbus or Boeing guaranteeing that their planes will never crash, so they must be essentially the same.

  • > Ask yourself this: Could your organisation recover if some malicious admin simply deleted all Azure Resource Manager resources in one go using PowerShell?

    We have streaming replicas for hot data AND regular snapshots shipped to offsite cold storage, because RAID is not a backup. If we experienced an equivalent event, we'd be fine.

    • The equivalent scenario to recovering from a bulk erasure of all Azure RM resources is this:

      How long will it take you to recover if someone deleted your switch configs, reset the SAN to factory defaults, wiped your firewall rules, deleted your Active Directory accounts (or equivalent), and then ran a secure erase on every physical server just to raze everything to the ground and salt the earth?

      I mean in wall-clock time, how long would it take your team to even figure out what is going on? Where would you start?

      Would you recover the switch first, or the server that you use to authenticate to it using RADIUS or LDAP?

      How will you securely connect to servers if your CRL and OCSP servers are down?

      How will you get access to your passwords if your file server where the key blob is stored is saying "Insert boot disk"?

      People think that disaster recovery is for "I deleted a folder".

      Disaster recovery is for disasters.

      Removing all Azure resources wipes everything. Your vNets... Poof! Your public IPs... Poof! Your internet-facing DNS zone... Poof! Your authentication credentials... Poof! Gone, gone, gone.

      How do you plan to restore dynamic IP addresses to their original values?

      How do you plan to restore DNS Zones that get assigned to 1 of 10 randomly selected server pools and hence have a 90% chance of requiring a change to the NS server glue records on restore?

      Do you even know which order things would have to be restored in to prevent failures during a restore?

      Could you possibly work out what is missing if you log on to your cloud portal and see the "Welcome to Azure, to get started click here" splash page?

      Get it?

    • "We have 'Data gone? Sucks to be you!' as translated by our VC's lawyer buried in our T&Cs" -- most "disruptive startups", probably...

  • If you have a proper disaster recovery plan then yes. All of the configuration of the entire system should be documented at least, if not generated by version controlled code. Then the only thing that needs to be backed up is actually data storage on volumes with snapshots or block storage services.

  • Maybe not even malicious, maybe they just put in the wrong subscription ID :(

    • Yup.

      This thought occurred to me when I was testing a bulk resource creation script.

      My workflow in my lab tenant was:

      1) Bulk create hundreds of resources
      2) Bulk wipe everything
      3) Go to step #1

      Turned out, I had some objects with globally unique names that were now conflicting in the production tenant, so I had to wipe my lab.

      I had already logged on to the production tenant, and I was so "trigger happy" that I very nearly ran my bulk-erase script against the wrong subscription.

      It was a terrifying moment of clarity.
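
One cheap guard against that kind of slip is to make destructive scripts refuse to run unless the active subscription is on an explicit allow-list. A minimal Python sketch of the idea follows; `guarded_bulk_delete`, `WrongSubscriptionError`, and the subscription names are all hypothetical, and the actual deletion is passed in as a callback rather than pretending to be any real Azure SDK call.

```python
from typing import Callable, Iterable


class WrongSubscriptionError(RuntimeError):
    """Raised when a destructive run targets a subscription that is not allow-listed."""


def guarded_bulk_delete(
    active_subscription: str,
    allowed_subscriptions: Iterable[str],
    resources: Iterable[str],
    delete: Callable[[str], None],
) -> int:
    """Delete every resource, but only if the active subscription is allow-listed.

    Returns the number of resources deleted.
    """
    allowed = set(allowed_subscriptions)
    # Check before touching anything, so a wrong target deletes nothing at all.
    if active_subscription not in allowed:
        raise WrongSubscriptionError(
            f"refusing to delete in {active_subscription!r}; allowed: {sorted(allowed)}"
        )
    count = 0
    for resource in resources:
        delete(resource)
        count += 1
    return count
```

The check happens once, up front, so a run against the wrong subscription fails before the first delete rather than halfway through.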

Dear customer,

This mail is a follow-up to the previous email we sent (on January 8th, 2020) on this topic. As a reminder, yesterday, we experienced an incident on a storage unit at our LU-BI1 datacenter, located in Luxembourg.

Despite the replication systems in place, and the combined efforts of our technical teams throughout the night, we were unable to recover the data that was lost on the impacted storage unit.

We sincerely apologize for the inconvenience that this situation has caused. This type of incident is extremely rare in the web hosting industry.

In the event that you have a backup of your data, we suggest that you use it to recreate your server at a different datacenter.

To help you in this, we have provided you with a promo code that will give you one free month for an instance, so that you can create a new Simple Hosting instance in a different datacenter:

    XXX

  • Wow, for a company that boasts "no bullshit", only offering a month after destroying data and backups seems a little tone deaf

    Edit: in fairness, I'm not sure how exactly you would quantify such a loss anyway...

    • Reputable hosting providers typically don't try to quantify such a loss, but rather outright offer a credit/compensation that is very obviously generous (say, a year or even two of free service).

      Especially when a small set of your customer base is affected, it won't cost you that much, and "overcompensating" like that means that virtually no one is going to criticize you for quantifying it wrong; instead, the public narrative will be centered around "well, shit happens, they did their best and generously compensated".

  • I could understand the incident (I would _at least_ start questioning myself about the quality of the service I'm paying), but IMHO this is not something that can be addressed with a casual e-mail that contains few lines of excuses and a "promo code" like it's everyday business. That's astonishing.

    Worse than a bad incident there is only bad management of the following situation.

  • > This type of incident is extremely rare in the web hosting industry.

    Why would they include that sentence? Are they trying to imply it is rare for them because it is rare for the industry? Are they saying they are not as good as the industry, so customers should move to other providers? Or are they trying to show they apply the same inattention to their customer communication as they apply to their data backup/recovery practices?

    This kind of data loss should simply never happen. It’s one thing to say “it will take us up to 30 days to restore your data because our fast recovery options aren’t working and we have to bring up cold archives”, it’s entirely another to say “your data is gone, tough”.

    • I'm not sure why you've been downvoted for this. I thought the same.

      I read it as: "This type of incident is extremely rare in the web hosting industry, because apparently the overwhelming majority of our competitors aren't capable of fucking up as badly as we just did."

      Doesn't inspire confidence at all, IMO.

    • > Why would they include that sentence?

      They're a French company; it may be a non-native speaker not catching the implication.

      It's also possibly an editing error, e.g. they started writing something like, "these types of incidents are extremely rare and when they happen etc" and most of it was dropped without considering how that changed the implication.

    • I think they're referring to the "incident" that they experienced (on the storage unit in the datacenter), not the situation as a whole. The implication is meant to be that they prepared for many things, but not something as unlikely as this.

    • I think it was meant to say "nobody is infallible", these events are extremely rare, but they /will/ occur, even if you're a customer of the best and biggest players.

  • > This type of incident is extremely rare in the web hosting industry.

    I read this as "so maybe you should consider one of the other web hosting companies that doesn't have problems like this."

  • Interesting. The public status page says they’re still waiting for the recovery process to complete.

  • Is this a response from the company or are you putting it forth as an example response for how to handle this incident better? It’s unclear from your post.

Looks like their backups only consisted of in-region backups on systems that were homogeneous. Common pitfall. While technically a 3-node distributed system may provide disaster recovery from one node failing, in practice, an accidental rm -rf from an ansible script targeting all three machines, or a bug in the software that's doing the replication, will leave you without a backup plan.

If you're in such a situation, the easiest is to do filesystem-level backups with something like zfs and ship the backups to a third-party system that only has write/append-only semantics (better yet, use a write-once-read-many (WORM) disk to really guarantee it). While there will still be _some_ data loss, it'll let you recover everything up to the last snapshot.

If you don't have zfs, a cron job that runs the db dump script and scp/sftps the dump to another server can also be an immediate remedy while you get your shit together (and by that I mean buy yourself a product with an immaculate reputation like aurora or cockroachdb to manage the db for you).

Harder but better would be to tee the log of the changestream (all distributed systems have such a log) to a third-party system. This is ideal because if it's done synchronously it'll let you recover everything up to the last committed transaction.
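
To make the "tee the log" idea concrete, here is a toy Python sketch. It assumes a trivial key/value store, a JSON-lines format, and a local file standing in for the third-party append-only system; `TeedStore` and `replay` are names invented for illustration. The write is acknowledged only after the log record has been flushed and fsynced, which is the synchronous variant.

```python
import json
import os


class TeedStore:
    """Toy key/value store that appends every change to a log before applying it."""

    def __init__(self, log_path: str) -> None:
        self.data = {}
        self.log_path = log_path

    def put(self, key: str, value) -> None:
        record = json.dumps({"key": key, "value": value})
        # Append-only: the log is opened in "a" mode and never rewritten.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(record + "\n")
            log.flush()
            os.fsync(log.fileno())  # durable before the change counts as committed
        self.data[key] = value


def replay(log_path: str) -> dict:
    """Rebuild the store's state by applying the logged changes in order."""
    state = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            state[record["key"]] = record["value"]
    return state
```

In a real setup the log would live on a different machine with different failure modes, otherwise it shares the fate of the primary.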

And of course, test your backups, because backups are subject to code rot as well.
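
A minimal sketch of what "test your backups" can look like in practice, in Python: archive a directory, record a digest of its contents, then restore into a scratch directory and compare. The function names and the zip format are my own choices for illustration, the shipping step (scp/sftp to another host) is left out, and the archive is assumed to live outside the source directory.

```python
import hashlib
import zipfile
from pathlib import Path


def tree_digest(root: Path) -> str:
    """SHA-256 over the relative paths and contents of all files under root."""
    h = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(path.relative_to(root).as_posix().encode("utf-8"))
        h.update(b"\0")
        h.update(path.read_bytes())
    return h.hexdigest()


def make_backup(source: Path, archive: Path) -> str:
    """Write all files under source into a zip archive; return the source digest."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(p for p in source.rglob("*") if p.is_file()):
            zf.write(path, path.relative_to(source).as_posix())
    return tree_digest(source)


def verify_backup(archive: Path, scratch: Path, expected_digest: str) -> bool:
    """Restore the archive into scratch and check it matches the recorded digest."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(scratch)
    return tree_digest(scratch) == expected_digest
```

The point is that the verification goes through an actual restore, not just a check that the archive file exists.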

  • What backup strategy are you implying for the case of cockroachdb? Streaming the changefeed (including timestamps) to an external append-only system, while slowly and incrementally iterating through all tables using `AS OF SYSTEM TIME` to reduce the impact on active transactions and to know how late each shard of a "full backup" can be inserted into the "augmented" changefeed you'd generate by interleaving these shards into the changefeed? For replay you'd use the stream from the oldest shard up to the newest timestamp for which you know you have the transactions' complete changesets, i.e. `select min(a) from (select max(timestamp_resolved) as a from changefeeds group by table)` (the resolved timestamp can be periodically emitted to confirm that no further records in the same feed/table could have a transaction timestamp earlier than it, inducing a partial ordering).

    You could replay the (combined, sorted, augmented) changefeed in order, or shard it on the table's primary key to ensure per-key monotonicity when applying the streams in parallel threads/transactions/nodes.
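
The replay cutoff described above can be sketched with plain Python data structures. Everything here is an assumption for illustration: the `Change` record layout is hypothetical and is not CockroachDB's actual changefeed format, timestamps are plain integers, and sharding uses a CRC of table and key. The cutoff is the minimum over tables of each table's newest resolved timestamp, and only changes strictly earlier than it are replayed.

```python
import zlib
from typing import Dict, List, NamedTuple


class Change(NamedTuple):
    table: str
    ts: int      # transaction timestamp of the change
    key: str     # primary key of the affected row
    value: object


def replay_horizon(resolved: Dict[str, List[int]]) -> int:
    """min over tables of max(resolved timestamp): complete for every table before this."""
    return min(max(timestamps) for timestamps in resolved.values())


def replayable(changes: List[Change], resolved: Dict[str, List[int]]) -> List[Change]:
    """Changes strictly earlier than the horizon, in timestamp order."""
    horizon = replay_horizon(resolved)
    return sorted((c for c in changes if c.ts < horizon), key=lambda c: c.ts)


def shard_by_key(ordered: List[Change], n_shards: int) -> List[List[Change]]:
    """Split an ordered stream so all changes for one (table, key) stay in one shard."""
    shards: List[List[Change]] = [[] for _ in range(n_shards)]
    for change in ordered:
        bucket = zlib.crc32(f"{change.table}/{change.key}".encode("utf-8")) % n_shards
        shards[bucket].append(change)  # input order is kept, so per-key order is too
    return shards
```

Each shard can then be applied by its own worker without reordering the updates to any single key.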

Gandi have something of a cult following, but in my only experience with them they literally lost my domain name during an inbound transfer.

Their response was awful and rude and completely unprofessional. I never got my domain back.

Based on that experience, this incident doesn’t surprise me at all.

  • I’ve always been a little confused about their cult following given their unfriendly terms — arbitrary domain cancellation based on adult material for example — which are fair terms to have if that’s their ethics but it seems at odds with the typical pro freedom expectations many people in technology hold.

    • It was founded by pioneers of the Internet in France who were involved in non-profit/hacker/open source circles, which is where it got its cult following from.

      But at the end of the day it's a cheap provider with, ahem, French-style support so I'm not sure what people were expecting out of them.

    • Any details on this? All I found while searching for this was Gandi explicitly advertising gTLDs designed for adult content...

      Do they have that in their terms? Independently of that, do they have a history of doing that?

  • Why do they have a cult following? I never heard about them and reading all this here, I cannot say I understand why anyone uses them at all.

    Edit; I use (and have been for a very long time) namecheap for registration and (recently) Cloudflare for DNS. I used to host all DNS myself, but that became a bit of a pain with many domains as that's definitely not my core business.

  • I very recently transferred a few domains to Gandi, and they also managed to lose one. I had to contact their customer support and they were able to restore it - it was all very strange. Combined with this incident and their responses on social media I'm getting the feeling that I should move them elsewhere again...

  • what can you recommend as an alternative?

  • In the year 2020, it's becoming increasingly impossible to trust anyone to do nearly anything (in my opinion of course).

    The courts are too expensive. The culture of taking pride in one's work maybe is disappearing.

    For the most crucial parts of doing business/living life, we are required to trust someone else. For example, I can't just go and make my own cell phone tower or ICANN.

    And yet I can't even trust those entities to get it right.

    • Decreasing trust increases transaction costs.

      There's got to be a measurable (negative) economic impact.

I don't have hosting with Gandi, but I do use them for domains and DNS. I'll be considering migrating my domains from them after this.

Their response to this is exceptionally poor. To say essentially "this could happen to any other web host" is nonsense. I've never had this happen with any of the providers I've used for hosting and I'd be very angry if I had just lost an entire VPS. The fact that they've lost all snapshots as well (which are advertised as backups of the underlying volume) is unforgivable.

  • I had an incident similar to this with Linode, which is why I use and recommend Digital Ocean nowadays.

    My machine going away because you had hardware issues isn't my problem, and I'll spend my money on a more competent company.

    • I had the exact same experience on Digital Ocean. Attempted to resize a VPS, the process got stuck for eternity, and support tells me all data is lost.

      Always have your own offsite backups.

    • When I worked at the WordPress hosting division of Copyblogger, we always had issues like these with Digital Ocean. They would email us saying that the node had a problem, and we had to recreate the server on our own.

      Good thing we only kept caching servers in Digital Ocean, so those were easily recreated, but that always kept me away from DO, personally.

      In fairness to them, though, DO does not claim to keep backups of the servers, as far as I know.

  • I use Gandi for domains & DNS too. I've never had any problems so far but I don't want any surprises... Where do you want to migrate? What is a better alternative?

  • I've been burnt several times now by smaller players claiming a higher degree of privacy that suddenly charge high fees, sell to a competitor, or sell my data. As of last month, I've moved my domains to Google. Better the devil you know than the devil you don't.

Interesting reaction. Is the highly negative reaction perhaps correlated with US culture?

I've used them for many years and had several complex support interactions with them.

Their customer service policy is very "API-like" in that you get exactly the t&c you paid for and nothing more. Hand-holding and soothing noises are not included in the t&c. They fuck up, you get a refund; you fuck up, they'll tell you exactly that. Outside that, they're very casual, relaxed humans to communicate with.

I find that far more trustworthy (in the mathematical sense) than a "slick" twitter feed.

Politeness does not imply trustworthiness.

Gandi is the absolute worst.

The last time I tried buying a domain through them, they took my money and then demanded "identification" via government ID (citing some bullshit in their ToS). I refused, so they closed my account and took the domain with them.

Based on that, I'm not surprised at all by their CEO's response to this incident[0]:

>If we led you to believe that you had nothing to do on your side when warned multiple times to make your back ups, then we'll have to make it clearer, and stop assuming that it's an industry wide knowledge.

[0]: https://twitter.com/StephanGandi/status/1215287619938062342?...

  • I had exactly this problem too but with NameCheap. Told them to put their id request and my money somewhere and left for Gandi.

    After more than 8 years with Gandi, not had a single issue with them.

I understand people might be upset because they lost data, but as a sysadmin, my reaction is "ooh shit, poor guys, that must be a horrible week"...

And honestly, if you don't keep data of stuff you host on a server provider like this, you kind of get what you deserve...

  • No you don't. While I agree everyone should have their own backups, you should expect your hosting company to properly replicate and back up their datacenters.

    • I don't, actually, expect them to do so. But even if I did, and Gandi, here, were doing backups and replications, no one is immune from errors and catastrophes.

      Pretending that the cloud is permanent and infallible is extremely dangerous. I would seriously question the competence of any sysadmin relying on this as a base principle.

      Sure, they screwed up, but this stuff happens. We should actually be happy it happens "only" on a "small-ish" provider like Gandi and not an entire AZ at Amazon.

      Can't wait for that shoe to drop, I'll bring the popcorn, if there's anything left of civilization then...

    • That is not the industry standard for web hosting. Never has been, never will be.

      Backups aren't free. Replication isn't free. DR isn't free. If a customer isn't paying a premium for them, they aren't getting them. Read the terms of service.

  • The sysadmins over there probably have a whole list of stuff that should actually have been done, but management never gave them time to do. Then this happened and they were proven right. Their reward? Working a lot of overtime probably.

  • > And honestly, if you don't keep data of stuff you host on a server provider like this, you kind of get what you deserve...

    While I agree that everyone should have their own off-site backups, this does come across as incredibly crass victim blaming.

  • At least now we know what kind of service we may expect from Gandi... Shit happens to everyone; it is in the cleaning up that you learn who you're dealing with, is my personal view on that.

  • > You get what you deserve

    Sure, let's blame the victims here; that's effective and helpful.

    • culpability isn't zero-sum, everyone can have some. some entities deserve a lot, others deserve just a teeny tiny little bit.

      for purposes of keeping your data safe, your cloud provider is just one, single, copy of your data. all of their redundancies and backups and whatnot are for _their_ convenience, not yours, regardless of the marketing copy.

      (they can decide to intentionally delete your data because they think you didn't pay. no amount of RAID and georedundant backups on their part will help you then.)

    • Oh god, the victims, really? You host your data on someone else's computer to save on costs and get rid of the burden of dealing with metal and stabbing yourself with screwdrivers, and you're the victim when they fuck up?

      Give me a break... It's not like anyone died here. There's a reason I host my own shit. Problems happen, errors are made, and data is lost. It's also your responsibility to deal with data permanence, even if your provider has all the promises in the world.

  • Shit does happen, but pretending like it's not a big deal and not providing a solid RCA seems to be what's really annoying about their reaction.

  • Even if it is a bad practice not to have your own backups, no one is at fault here but Gandi

Gandi is absolutely not the company I expected this to come from.

With that having been said, everyone please stop assuming your data is safe. It’s never safe, but it’s extremely not safe single homed somewhere. Make backups. Anything that’s saved locally on one machine only? Consider it gone until it’s backed up.

Cloud providers may be able to give you better assurances, but if you really care about data give it at least 2 independent homes. I’ve lost data more than I care to admit. BuyVM lost one of my VPSes years ago. Whose fault was it really?

When you are ready to stop kidding yourself about your data, check out some backup solutions. I particularly like Borg Backup:

https://github.com/borgbackup/borg

And if you do not have network attached storage anywhere there are services that provide it as a service.

(Note: I think needless to say it’s also a good idea to back your NAS up to other places too, although I haven’t gotten into this practice yet. Synology supposedly has a lot of features around this.)

The key question is, did Gandi offer an explicit backup service for your data on their plans? I just had a look and I don't see this being offered.

As a former hosting engineer, and at the risk of pissing on everyone's outrage parade: unless an explicit guarantee of a backup is included in your plan's contract, or you can pay for backups as a bolt-on, then if you've lost data it's your fault for not planning for this scenario.

And I mean proper backups where you get, for example, twenty-eight days of hourly backups and you can pick a specific version of a file to recover in that 28-day period. And where those backups are stored on different hardware or off-site. We offered this as a bolt-on (in-site and off-site). It was 20 quid a year for in-site; the off-site was a bit more. But a great many customers chose not to pay for this add-on, despite the great big red bold warning text explaining that unless they paid for this add-on we made no guarantees about the permanence of their data in the event of a storage problem. Guess what....
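
As a rough illustration of the scheme described above (hourly backups kept for 28 days, with the ability to pick a specific version), here is a minimal Python sketch. The names `prune` and `pick_version` and the sample history are assumptions made for this example; they are not the tooling that hosting provider actually used.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=28)  # window taken from the comment above


def prune(backups, now, retention=RETENTION):
    """Return the backup timestamps that are still inside the retention window."""
    cutoff = now - retention
    return sorted(ts for ts in backups if ts >= cutoff)


def pick_version(backups, wanted):
    """Return the newest backup taken at or before `wanted`, or None if there is none."""
    candidates = [ts for ts in backups if ts <= wanted]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    now = datetime(2020, 1, 9, 12, 0)
    # Hypothetical history: one backup per hour for the last 30 days.
    history = [now - timedelta(hours=h) for h in range(30 * 24)]
    kept = prune(history, now)
    print(len(kept), "backups kept")
    print("restore point:", pick_version(kept, datetime(2020, 1, 1, 8, 30)))
```

The sketch only decides which copies to keep and which one to restore; where the copies live (different hardware or off-site) is the part that actually protects against a storage failure.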

Now that's not saying we didn't take snapshots of the hosting environment, but they were for internal use and to allow us to recover quickly in the event of something unexpected going wrong, but now and again stuff breaks.

Sure, it's unfortunate some lump of storage hardware has failed and whatever mirrors they may have had have been taken out as well. They possibly could have done better but shit happens sometimes.

You shouldn't rely on an "implied backup" from your service provider; if you want that then you're going to be paying a shedload more for hosting your Wordpress and Woocommerce site. It's up to you to make absolutely sure your data is safe if it's critical to the day-to-day running of your business.

Edit: ok, so this is tucked away in their docs (thanks to itake below):

https://docs.gandi.net/en/simple_hosting/common_operations/s...

But it does say:

> Snapshots do not make a backup of your databases. If you would like to perform a backup of your databases, we recommend you perform an export, or launch a dump script via crontab.

The bottom line...is it guaranteed in your contract? Always check. And as per my follow up comment, those plan prices are just too cheap for that facility to be taken seriously for business continuity. They're a convenience to quickly recover a version of a file, not a serious backup.
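
For the "dump script via crontab" that the quoted documentation recommends, a minimal sketch might look like the following. It uses Python's built-in `sqlite3` purely as a stand-in; a real MySQL or PostgreSQL site would call that database's own export tool instead. The function name `dump_database` and the file naming are assumptions for illustration.

```python
import gzip
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def dump_database(db_path, out_dir, now=None):
    """Write a gzip-compressed SQL dump of `db_path` into `out_dir` and return its path."""
    now = now or datetime.now(timezone.utc)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"dump-{now:%Y%m%dT%H%M%SZ}.sql.gz"
    conn = sqlite3.connect(db_path)
    try:
        with gzip.open(target, "wt", encoding="utf-8") as fh:
            # iterdump() yields the SQL statements needed to recreate the database.
            for statement in conn.iterdump():
                fh.write(statement + "\n")
    finally:
        conn.close()
    return target
```

A cron entry would then run a small script calling this function on a schedule, and the resulting file still has to be copied somewhere other than the host it came from to count as a backup.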

  • > Easily recover backups of previous versions of your website's files, thanks to our automatic Snapshots system. It's free!

    https://docs.gandi.net/en/simple_hosting/common_operations/s...

    They are supposed to be providing backups.

    • I believe nobody should count on backups provided by the product that stores your data.

      There are different kinds of backups here:

      * the ones that are part of the offer, where the provider gives you a convenient way to recover from your mistakes, this is a feature they provide when their services are operational (in this case, the snapshots feature).

      * the ones they put in place to mitigate incidents and maintain their SLOs. If you accidentally delete a file, you don't have access to them, they are useless to you. These backups are a means to reach their service level objectives. Nobody can offer you a 100% guarantee in an SLO that they won't lose your data. If someone promises you this, just... don't believe it.

      (edit: formatting, typo, mention snapshots in case 1)

    • Key question, is it guaranteed in your contract?

      Also:

      > Snapshots do not make a backup of your databases. If you would like to perform a backup of your databases, we recommend you perform an export, or launch a dump script via crontab.

      For those plan prices if I was running anything mission critical there then I'd be making darned sure I was squirting copies of my site's dynamic data to somewhere else on a regular basis (and you should also be able to re-deploy your code from local). Even if there was a guarantee, I'd still have a backstop in place. Never underestimate the chance of a good cockup.

    • The thing is, backup can mean different things. Those free/cheap "backup" snapshot things are obviously the equivalent of vim backup files. It's useful if you screw up and want to revert two hours later.

      You have to be foolish to assume you get proper, actual backups for the price of a Coca-Cola can.

  • "unless they paid for this add-on we made no guarantees about the permanence of their data in the event of a storage problem"

    To be honest this is not a good way to do hosting business. If you provide a service called "Simple Hosting", putting backup requirement on customer (when it is your fault) is pretty unfair.

    PS: I think price of the product shouldn't affect minimum requirements.

    • > putting backup requirement on customer (when it is your fault) is pretty unfair.

      Then they need to pay more for their "Simple" hosting.

      > I think price of the product shouldn't affect minimum requirements.

      See above. Sigh.

I'm a long term user of Gandi for my domains but have wanted to get off them for some time now.

Can anyone recommend a domain registrar "equivalent" of a Fastmail or Letsencrypt or DNSMadeEasy, i.e. truly no bullshit, geek friendly and polished at the same time?

I'm not too bothered about price. I just want a well run outfit that has a wide selection of TLDs and ccTLDs (and ideally isn't a mega corp like google but is big enough that I don't have to worry about them disappearing overnight).

Azure Shared Responsibilities [0]

AWS Shared Responsibilities [1]

Flipping a switch that says "Backup" does not mean you are handing your responsibility to them. At most, they will fail to meet their SLA, write you a check according to the ToS and be done with it. At best, you'll be able to bitch about it on Twitter, possibly threaten a lawsuit (you read the ToS?) and still be in the same position because you did not share the responsibility of securing your data.

[0] https://docs.microsoft.com/en-us/azure/security/fundamentals...

[1] https://aws.amazon.com/compliance/shared-responsibility-mode...

> We sincerely apologize for the inconvenience that this situation has caused. This type of incident is extremely rare in the web hosting industry.

Why are they speaking of the "industry" as a whole when they are to blame?

It's even crazier they are not even explaining the source of the data loss and why the "replication systems" didn't help.

IMHO they are trying to sweep this event under the carpet. They should instead explain why they should be trusted in the future and why this would not occur again.

  • yeah, I worked for several years at a hosting provider, and I can tell you for a fact that this wouldn't have happened there.

    They're 100% virtualized and keep backups of all those machines. In addition, you can purchase a package so that YOUR backups are automatically backed up to 2 different datacenters. Between the two of those solutions, there would be a way forward.

    I don't really know what Gandi is so I can't speak to them directly, but this is a solvable problem.

  • Replication is explicitly designed to correct issues associated with bitrot and add redundancy.

  • Replication is not a backup as was already mentioned. A great example of this is when the KDE project almost lost all of their Git repos because they were mirroring a corrupted copy of the data. https://www.phoronix.com/scan.php?page=news_item&px=MTMzNTc

    • Fortunately, git is a DVCS, so anyone who checks out a repo has a complete copy of it.

      Now, granted, it'd be a huge pain to track down all the people who had copies of the 1,500 different repos, and try to find as up-to-date as possible of a version of each, but I doubt they got anywhere close to potentially losing all their source code.

      Incidentally this shows why it's a good idea to sync your repo to GitHub, even if the canonical repo is elsewhere: in addition to the usual reasons of incentivizing some contributors by giving them "GitHub credit", and increasing visibility of your project's code, GitHub can serve as a backup!

      Also, on a side-note, 1,500 separate repositories?! That sounds way overkill. I wonder if they'd benefit from having a monorepo.

  • In the last few years, I have seen many people confuse replication with backups. People see them as the same thing, but they really aren't. Even with snapshots, if the devices are the same, they might have the same firmware bug, etc.

    • Just to further explore that a bit, would you say replication adds independent copies against failures of media, while backup adds copies made by independent software against failures of process / software / media?
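
To make the distinction drawn in this subthread concrete, here is a toy Python sketch. The `Replica` and `BackupStore` classes are invented for illustration and model no real storage product: the replica copies whatever the primary currently holds, while the backup store keeps independent point-in-time versions.

```python
class Replica:
    """Mirrors the primary: every write, good or bad, is copied immediately."""

    def __init__(self):
        self.data = {}

    def sync(self, primary):
        self.data = dict(primary)


class BackupStore:
    """Keeps independent point-in-time copies that later writes cannot touch."""

    def __init__(self):
        self.versions = []

    def take(self, primary):
        self.versions.append(dict(primary))

    def restore(self, index):
        return dict(self.versions[index])


primary = {"index.html": "<h1>hello</h1>"}
replica, backups = Replica(), BackupStore()

replica.sync(primary)
backups.take(primary)

# A bug or operator error corrupts the primary...
primary["index.html"] = "\x00\x00garbage"
# ...and replication faithfully propagates the corruption.
replica.sync(primary)

print("replica:", repr(replica.data["index.html"]))
print("backup: ", repr(backups.restore(0)["index.html"]))
```

The replica ends up holding the corrupted content, while the earlier backup version is unaffected, which is the same failure pattern as the mirrored KDE repositories mentioned above.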

The "no bullshit" motto is mentioned a few times here. A motto is just another marketing device: a way for a company to pretend to have any sort of principles beyond making as much money as fast as possible.

Why would anyone believe a motto is anything other than a marketing device? It is only believable if people follow it contrary to pragmatism. Any company is eventually going to have a fair share of people who believe being pragmatic is more important than their motto. And in Western culture at least it's usually considered rude to bring up the "big guns" and have a fundamental values discussion when everybody just wants the meetings to end and to start making more money.

  • In fact a motto is usually chosen to cover a weak spot. So "No Bullshit" reads to me as "We're Kinda Cowboys". "We Care" - "People Know We Don't care".

    Fujitsu – “The possibilities are infinite” ... "The Ways in Which we can Screw this Up Are Infinite"

    Intel – “Leap Ahead” and “Sponsors of Tomorrow”. "We've got to protect our entrenched position".

    LG – “Life's Good”. "Life is Actually Objectively Bad".

    Google - "Don't be Evil". "How We Actually Make Money is Evil But Our Mission is Good".

shit happens...but the way their philosophy is fine tuned makes me wonder..

Above all, "no bullshit" is our golden rule—to treat our users how we want to be treated. It's a promise to respect your rights and to level with you about our shortcomings.

https://www.gandi.net/en-US/no-bullshit

ex: https://twitter.com/andreaganduglia/status/12151991477012316... (thanks op)

We will listen to you, and be honest in our replies, even if it means you won’t always like what we say.

  • > We will listen to you, and be honest in our replies, even if it means you won’t always like what we say.

    They are actively treating their customers like shit, and that tone starts at the top. No bullshit does not give creative license to be assholes to people that are panicked because of something you directly caused.

I hate to say it but Gandi seems like they’re in a quality freefall. I had a domain there a year or two back because they were one of the only registrars that supported that particular extension... and man, so many problems just with simple tasks like updating the WHOIS info and credit card for renewal.

This is basic stuff.

I had a co-worker who was super chill during outages; especially at night, we were 10-15 people on the call fixing issues related to his work almost monthly.

Those outages cost millions of euros, and he never picked up his phone at night. Once I asked him why he never picks up, and he told me:

"I used to be a general surgeon, when someone calls me people die. Relax, nobody is dying during our outages."

Now I think I am taking myself (and my work) too seriously.

  • It's likely the only way to stay sane in a corporate environment. The problem is, you don't need many people practicing that before it drags everyone down to the same level. You can choose to continue your quest to do your work seriously (this will likely drive you insane over the years), or to join in that kind of negligence (goodbye spine), or to quit. In the end it got us where we are now: a world filled with fake companies selling their fake little products as if they were quality, and making a game of disrespecting their own customers. Pure facade, been there...

    I've done it before, but I'll recommend to you Scott Adams' book, The Dilbert Principle for some light reading about forces like that at work.

  • Given that the statistical economic value of human life is around $7m, maybe he should start picking up his phone.

  • "I used to be a general surgeon, when someone calls me people die."

    Hence he's not a surgeon anymore.

From the incident timeline:

> we have a problem to import zfs pool on the unit storage

I really want to know what went wrong to a) break ZFS b) prevent recovery from backup.

  • Re myself, it looks like they're using FreeBSD-based ZFS filers with iSCSI/NFS exports using a user-space NFS server:

    * https://news.gandi.net/en/2019/09/exporters-detect-micro-inc...

    > Gandi’s storage infrastructure consists of two environments: one for IaaS and one for PaaS. Both are based on FreeBSD-based storage units (filers), that stock each volume (disk) as though it were a ZFS volume.

    * https://news.gandi.net/en/2019/03/tracking-a-storage-issue-l...

    * https://www.bsdcan.org/2016/schedule/attachments/351_FreeBSD...

    No mention of what they're doing for backups / "replication systems", unsurprisingly/unfortunately. I'm anxious to know what the failure mode for `zfs send | zfs receive` replication is here?

    • Sounds very much like they weren't doing zfs send | zfs receive to anything sufficiently physically separated. For example, if you send and receive in the same pool, it's replication but still leaves you vulnerable to issues where the pool can't be imported due to corruption in the wrong places (it can happen) or significant hardware failure (eg a PSU fault that takes out too many of the drives in the pool).
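
A sketch of what sending snapshots to a physically separate machine could look like, written as a Python helper that only builds the command strings and does not run them. The dataset names, the host name and the function `replication_commands` are hypothetical; nothing here reflects how Gandi's filers are actually configured.

```python
import shlex


def replication_commands(dataset, snapshot, remote_host, remote_dataset, previous=None):
    """Return the shell commands for one snapshot-and-send cycle (nothing is executed)."""
    snap = f"{dataset}@{snapshot}"
    send = ["zfs", "send"]
    if previous:
        # -i sends only the changes since the earlier snapshot.
        send += ["-i", f"{dataset}@{previous}"]
    send.append(snap)
    # The receiving pool lives on another host, so losing the local pool
    # does not take the copies with it.
    receive = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    return [
        shlex.join(["zfs", "snapshot", snap]),
        shlex.join(send) + " | " + shlex.join(receive),
    ]


if __name__ == "__main__":
    for cmd in replication_commands(
        "tank/www", "2020-01-09", "backup.example.net", "vault/www", previous="2020-01-08"
    ):
        print(cmd)
```

Building the commands as strings keeps the example safe to run anywhere; an actual job would execute them and check that the receive side succeeded.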

last update says they were able to restore a version

Updated on Thursday, 9:58 PM +0200: we're not sure we will be able to provide the data but we were able to recover a version of the filesystem from right before the crash

I have been using them for DNS and some minor hosting for a long time and I will stay with them. I think it's important to avoid the monoculture/centralisation which is otherwise happening.

Sure Gandi has their flaws, they are humans.

I expect they will do a proper post-mortem on what went wrong and how they managed to fix it. It seems they were using ZFS and relied on it a bit too much. Or, if they did indeed manage to restore the last snapshot, then their only error might have been the classic one of underestimating how long restoring/investigating several terabytes takes, even on modern hardware.

There have been a number of threads suggesting places like Gandi over AWS because they are so much cheaper. I've always been skeptical about building key apps on these types of places but folks INSIST it's the right choice.

3TB at Gandi costs $6 + you get compute with it. 3TB of bandwidth at AWS might be $270.

Has anyone tried this instead of using cloudfront etc? Get 100 $6 hosts and pump out content for your ipv6 connecting clients etc?

This seems like data has been lost from servers hosting sites/services.

Since Gandi is mostly known for domain registrations and DNS, I'm curious if you (as an individual who hosts websites/online services somewhere on the web) back up your site's DNS records periodically (or whenever they're changed). What if your authoritative name server lost data and all the caches of those records across geographies expired while you were asleep/away? If you do back these up regularly, how do you do it in an automated way on a *nix system? I found this article [1] when I searched about this, but it's not a simple shell script. The scripts that I did find on some of the Stackexchange sites seemed to have specific subdomain names hardcoded.

[1]: http://www.programblings.com/2012/07/23/do-you-back-up-your-...

  • Gandi has an API and lets you download entire zonefiles if they host your DNS.
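
As one possible answer to the automation question, here is a small Python sketch that keeps dated copies of a zone file and writes a new one only when the content has changed. How the zone text is obtained (a registrar API export, a zone transfer, and so on) is deliberately left out; `save_zone_if_changed` and the file naming scheme are assumptions for this example.

```python
from datetime import datetime, timezone
from pathlib import Path


def save_zone_if_changed(zone_text, backup_dir, domain, now=None):
    """Store `zone_text` as a new dated file unless it matches the latest saved copy.

    Returns the path written, or None when nothing changed.
    """
    now = now or datetime.now(timezone.utc)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # File names sort chronologically because they start with a UTC timestamp.
    previous = sorted(backup_dir.glob(f"*_{domain}.zone"))
    if previous and previous[-1].read_text(encoding="utf-8") == zone_text:
        return None

    target = backup_dir / f"{now:%Y%m%dT%H%M%SZ}_{domain}.zone"
    target.write_text(zone_text, encoding="utf-8")
    return target
```

Run from cron after fetching the zone, this leaves a history of every change, so a lost zone can be re-imported from the most recent file.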

I helped co-found a large Dropbox-like white label product. We used AWS and especially s3 for storage.

After many many years of experience with systems, I made sure we had as many possible ways to recover user data as we could. The initial solution was a large Postgres database for all the metadata/indices and s3 for the actual storage.

Despite much pushback we built in little things like an individual meta file on the file system for each file we stored. That way, if we lost the Postgres DB for any reason, we could create a script to rebuild the DB and restore access, avoiding massive counts of orphaned files. A simple and probably stupid solution but...

Well guess what - the DB got corrupted and after some ado, we restored all access and none of our customers lost anything.

No it’s not full backups but...
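
The per-file meta file idea can be sketched roughly as follows. This is not the product's real code: a local directory stands in for S3, `sqlite3` stands in for Postgres, and `store_file`, `rebuild_index` and the `.meta.json` suffix are names invented for the example.

```python
import json
import sqlite3
from pathlib import Path

META_SUFFIX = ".meta.json"  # hypothetical naming convention for the sidecar files


def store_file(root, name, content, owner):
    """Write the blob plus a small sidecar file describing it."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    (root / name).write_bytes(content)
    meta = {"name": name, "owner": owner, "size": len(content)}
    (root / (name + META_SUFFIX)).write_text(json.dumps(meta), encoding="utf-8")
    return meta


def rebuild_index(root, conn):
    """Recreate the metadata table from the sidecar files alone."""
    conn.execute("DROP TABLE IF EXISTS files")
    conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, owner TEXT, size INTEGER)")
    count = 0
    for meta_path in sorted(Path(root).glob("*" + META_SUFFIX)):
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        conn.execute(
            "INSERT INTO files (name, owner, size) VALUES (?, ?, ?)",
            (meta["name"], meta["owner"], meta["size"]),
        )
        count += 1
    conn.commit()
    return count
```

Because each sidecar sits next to the data it describes, the index can be regenerated from storage even if the database is lost entirely, at the cost of one extra small write per stored file.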

The problem with cheap hosting is that they want to use backups as an upsell, but you should still have backups to cover the company’s ass even if the customer doesn’t get to use them. 123 Reg also lost a load of customers’ VPSes a year or two ago, thanks to a faulty script.

  • Not excusing them but..

    With modern container hosting you really should be able to make your own, even with cheap VPS hosting. There is no reason anymore to live in a world where you lose anything when a server goes down.

As a customer, I only have good things to say about Gandi.net but have to admit this is subpar customer communication right there.

Lost 3 sites built with WordPress. Will rebuild as static sites repo separate from host, no more database, lesson learned.

In the world of cloud, this should be pretty trivial. Upload your daily dumps / asset metadata to S3/GCS/ABS.

Set a retention policy, i.e. even if someone ran some delete command it wouldn’t delete. Someone with retention lock permissions is the only one who can remove the locks and delete.

There is cold storage and other things even cheaper. But cloud object prices are pretty cheap per GB it’s ridiculous.

I think they make most of the margins on bandwidth.

Losing customer data, all customer data, is pretty ridiculous. I can understand downtime. I can understand losing a day of changes. But everything? That’s just unacceptable business.

I have a custom domain with Gandi and take advantage of their mail forwarding option to forward the emails sent to the custom domain (my “no lock-in” email address) to my personal Gmail account.

Considering how critical email is for me, seems like I won’t be trusting their MX servers to process all my inbound mail anymore and will soon be looking for another solution that works well with Gmail (don’t want to pay for GSuite), and possibly also transfer my domain to another registrar.

That support tweet is such bad taste.

  • gandi's response notwithstanding: email is hardly reliable

    unless you run all the MXes (and can prove otherwise): you're likely having emails dropped all the time already

    • For sure, but everything is relative. Ignoring for a second the lock-in factor of using a @gmail.com address, I would trust Google's MX servers any day over Gandi's, especially after this last incident (trust == reliability in this context).

This is like living in an alternate universe. I've been heavily involved in all things programming and webdev for years, following trends and whatnot, and it is literally the first time I'm hearing of this particular company. Can someone briefly explain what is (was?) so special about them that they attracted the HN crowd? Why would I buy a domain from them when something like namecheap, even google domains exists? Why would I even host something there?

  • > Why would I buy a domain from them when something like namecheap, even google domains exists? Why would I even host something there?

    If you're in Europe, they're cheap for many European countries' domains.

    Back 10-15 years ago they were special because it felt like a hacker kind of company. They gave free WHOIS privacy, what seemed like good DNS control/UI at the time. But it was the WHOIS privacy that got me onto them.

    I still use them because they're around half the price for .co.uk than many registrars - and many others I've used have become more rubbish than Gandi has.

    All my DNS is hosted elsewhere now, and I never understood why Gandi introduced hosting et al. I've never used it and never would, it seemed a terrible diversification for a good domain registrar.

    • Can you explain how you not using their hosting makes it a terrible diversification?

      I think the majority of people buy domains for hosting websites, so it makes sense they would want to set one up using one-click WordPress or something similar.

  • I wrote out a list of things I needed in a domain registrar and once you include U2F logons and DNSSEC support, you find yourself in a very limited space.

I have about a hundred domains registered at Gandi. I used to like the former management interface, but I really hate the new one.

Is there a registrar you would recommend as an alternative? I don't need DNS; give me nameservers and glue records and I'm OK.

The main selling points are stability, transparency and simplicity. I don't care if it's not the cheapest.

Gandi is never really impressive, but they're one of the few registrars where I can get .af domains without a hassle.

All my domains are registered through Gandi. What good registrars would you suggest? I'd like to move them out.

I have some domains here, mostly secondary domains to not have all my eggs in my namecheap basket (e.g. if anything happens to namecheap or my namecheap account).

Will likely transfer those to elsewhere after this. Probably Name.com, I guess.

(1h22m before this comment)

> Updated on Thursday, 9:58 PM +0200:

> we're not sure we will be able to provide the data but we were able to recover a version of the filesystem from right before the crash

Maybe it's not all gone.

The assessment is taking a long time because there are several TB of data on the filer

Is that a lot of data? That sounds like a very small filer that could have easily been backed up.

Regardless of any of the technical aspects of this disaster, the attitude of the company and its customer service means I will be staying far away from them.

For not the first time I'm left thinking that the "big filer" model is not such a great idea :(

Does anyone know what Gandi is using as a “filer”?

  • ZFS by the looks of the status updates:

    >we have a problem to import zfs pool on the unit storage. Our engineers are still working on it.

There's an upside to knowing your data is not backed up somewhere, and that when you delete it, it's really lost. They should offer that as privacy-conscious hosting.

Hey guys, sorry for being a philistine about this, but does that mean we have lost our domain name, and how can we migrate it to another hosting platform? Cheers

  • If you only had a domain you're likely not affected (metadata). If you also have a website hosted there (data), you may be.

Just a quick question: how do I transfer my site address to another hosting company, or is that lost too? Sorry for being a philistine about this... Cheers