Comment by jiggawatts

6 years ago

Just to play devil's advocate: This is in no way different to how Azure, AWS, and GCP operate. They don't have backups either. They too rely on n-way replication, a bit like a distributed RAID.

All cloud providers make it absolutely clear, in black & white, that protection of your data is your responsibility, not theirs.

What I find hilarious is that most cloud providers only provide built-in backup functionality for a tiny subset of their services.

Ask Microsoft if they have a "backup" button for Azure DNS Zones. Or Azure load balancers. Or anything else that isn't a VM disk, App Service, SQL Database, or a Secrets Vault.

I mean, look at this insanity: https://docs.microsoft.com/en-us/azure/backup/backup-azure-f...

"Backup for Azure file shares is in Preview."

After 10 years of operation, this trillion-dollar company has only a use-at-your-own-risk beta for data protection!

Don't be too hasty to point fingers at Gandi and laugh about how they're unprofessional. Whatever you're using is essentially the same.

Ask yourself this: Could your organisation recover if some malicious admin simply deleted all Azure Resource Manager resources in one go using PowerShell?

17 comments


bscphil 6 years ago

Everything you say here is true, but at the same time it's just a fact that Gandi lost a lot of customers' data, and AWS, GCP, and Azure have never (as far as I know) lost a significant amount of it at once. You can talk about theoretical responsibility for data, and it's true, you are responsible for having backups of your data, no matter how many "9s" the service has, but the basic fact is that some services have been consistently good at not losing customer data, and others haven't. Even though I'm going to back up my data no matter where it is, I'd still rather use the service that's got a better track record with it.

I haven't ever even lost a file on Google Drive, which as far as I know provides no reliability guarantees at all.

jiggawatts 6 years ago
Back in the early days GMail lost customer data due to storage corruption. It has happened.
The rarity is immaterial, the responsibility for data protection lies with you, not them.
- bscphil 6 years ago
  
  > The rarity is immaterial
  Of course it's material. If a provider has a 0.001% chance of losing some of my data in a year, I'm an idiot for not having backups. If a provider has a 10% chance of losing some of my data in a year, I'm an idiot for not having backups and for using that provider.
  GMail is (usually) not an enterprise product and not a paid service, and provides no reliability guarantees. And yet it seems to be pretty damn good in practice.
- glenneroo 6 years ago
  
  Well to be fair Gmail was still in Beta ;)
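
The 0.001% versus 10% comparison above can be made concrete with a little arithmetic. This is only a sketch: it assumes each year is an independent trial with a fixed loss probability, the function name is made up for illustration, and the five-year horizon is arbitrary.

```python
def chance_of_any_loss(annual_probability: float, years: int) -> float:
    """Probability of at least one loss event over `years`.

    Assumes each year is independent with the same loss probability,
    which is a simplification and not a claim about any real provider.
    """
    return 1.0 - (1.0 - annual_probability) ** years


# The two annual probabilities quoted in the comment above.
for label, p in [("0.001% per year", 0.00001), ("10% per year", 0.10)]:
    print(f"{label}: {chance_of_any_loss(p, 5):.4%} chance of any loss in 5 years")
```

Under those assumptions the first provider stays well below a hundredth of a percent over five years, while the second is around 41%. You need backups in both cases, but the odds of ever having to use them are very different.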
  

jonas21 6 years ago

That's kind of like saying there's no difference in safety between an airliner and the winged contraption that my idiot brother built in his garage.

After all, they both have wings and will both kill you if they fall out of the sky, and I don't see Airbus or Boeing guaranteeing that their planes will never crash, so they must be essentially the same.

Polylactic_acid 6 years ago

>Boeing
That just confirms the parent comment

yjftsjthsd-h 6 years ago

> Ask yourself this: Could your organisation recover if some malicious admin simply deleted all Azure Resource Manager resources in one go using PowerShell?

We have streaming replicas for hot data AND regular snapshots shipped to offsite cold storage, because RAID is not a backup. If we experienced an equivalent event, we'd be fine.

jiggawatts 6 years ago
The equivalent scenario to recovering from a bulk erasure of all Azure RM resources is this:
How long will it take you to recover if someone deleted your switch configs, reset the SAN to factory defaults, wiped your firewall rules, deleted your Active Directory accounts (or equivalent), and then ran a secure erase on every physical server just to raze everything to the ground and salt the earth?
I mean in wall-clock time, how long would it take your team to even figure out what is going on? Where would you start?
Would you recover the switch first, or the server that you use to authenticate to it using RADIUS or LDAP?
How will you securely connect to servers if your CRL and OCSP servers are down?
How will you get access to your passwords if your file server where the key blob is stored is saying "Insert boot disk"?
People think that disaster recovery is for "I deleted a folder".
Disaster recovery is for disasters.
Removing all Azure resources wipes everything. Your vNets... Poof! Your public IPs... Poof! Your internet-facing DNS zone... Poof! Your authentication credentials... Poof! Gone, gone, gone.
How do you plan to restore dynamic IP addresses to their original values?
How do you plan to restore DNS Zones that get assigned to 1 of 10 randomly selected server pools and hence have a 90% chance of requiring a change to the NS server glue records on restore?
Do you even know which order things would have to be restored in to prevent failures during a restore?
Could you possibly work out what is missing if you log on to your cloud portal and see the "Welcome to Azure, to get started click here" splash page?
Get it?
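
One way to answer the restore-order question ahead of time is to write the dependencies down and sort them. The sketch below uses Python's standard `graphlib` module. The resource names and the dependency map are hypothetical, invented purely for illustration, and a real environment would have far more entries.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each resource lists what must exist before it.
# None of these names come from a real environment.
RESTORE_DEPENDENCIES = {
    "virtual_network": set(),
    "dns_zone": set(),
    "public_ip": {"virtual_network"},
    "identity_store": {"virtual_network"},
    "key_vault": {"identity_store"},
    "app_servers": {"virtual_network", "public_ip", "key_vault", "dns_zone"},
}


def restore_order(dependencies):
    """Return resource names with every dependency ahead of its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


print(restore_order(RESTORE_DEPENDENCIES))
```

`static_order()` raises `graphlib.CycleError` when the map contains a loop, which is the switch-versus-RADIUS problem in miniature: if two things each need the other to come back first, the plan has to break that loop before the disaster, not during it.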
- smnrchrds 6 years ago
  
  > The equivalent scenario to recovering from a bulk erasure of all Azure RM resources is this
  It just occurred to me how much easier it is to wipe everything in the cloud age than the on-prem age. Doing all the things you said for on-prem takes some serious effort. Some, like factory resets, may be impossible without individual physical access. You would probably be discovered and stopped before you can inflict much damage. In the cloud age however, it takes orders of magnitude less time and effort to inflict the same damage.
  It is kinda like how much easier it is to steal data now. Before the digital age, stealing as much data as the Equifax hack would have required moving truckloads of paper without being discovered. It was simply impossible to pull it off in reality. In the digital age, however, we have accepted massive data leaks as not only possible, but unavoidable.
  
- yjftsjthsd-h 6 years ago
  
  I think you're moving the goalposts. Gandi didn't lose all their servers and all the networking hardware and all the storage. They lost what sounds like a single replicated volume. If, y'know, all of their datacenters burned down at once, or an attacker got access and deleted their PaaS account, I think we'd all be a lot more sympathetic.
  
bigiain 6 years ago

"We have 'Data gone? Sucks to be you!' as translated by our VC's lawyer buried in our T&Cs" -- most "disruptive startups", probably...

singlow 6 years ago

If you have a proper disaster recovery plan then yes. All of the configuration of the entire system should be documented at least, if not generated by version-controlled code. Then the only thing that needs to be backed up is actual data storage on volumes with snapshots or block storage services.
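
As a rough illustration of why version-controlled configuration helps, here is a minimal sketch that compares a declared inventory against whatever is currently live. The inventory contents and the function name are hypothetical, and fetching the live listing from a provider is deliberately left out.

```python
# Hypothetical inventory generated from version-controlled config:
# resource name -> resource type. Not taken from any real system.
DESIRED_RESOURCES = {
    "vnet-main": "virtual_network",
    "dns-example": "dns_zone",
    "lb-frontend": "load_balancer",
}


def missing_resources(desired, actual_names):
    """Return (name, type) pairs the config expects but the live listing lacks."""
    live = set(actual_names)
    return sorted((name, kind) for name, kind in desired.items() if name not in live)


# After a full wipe, the live listing (however it is fetched) comes back empty,
# so everything in the declared inventory is reported as missing.
print(missing_resources(DESIRED_RESOURCES, []))
```

With a declared inventory, an empty portal stops being a guessing game: the config itself is the list of what has to be recreated.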

teddyuk 6 years ago

Maybe not even malicious, maybe they just put in the wrong subscription ID :(

jiggawatts 6 years ago

Yup.
This thought occurred to me when I was testing a bulk resource creation script.
My workflow in my lab tenant was:
1) Bulk create hundreds of resources
2) Bulk wipe everything
3) Go to step #1
Turned out, I had some objects with globally unique names that were now conflicting in the production tenant, so I had to wipe my lab.
I had already logged on to the production tenant, and I was so "trigger happy" that I very nearly ran my bulk-erase script against the wrong subscription.
It was a terrifying moment of clarity.
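
A cheap guard against this kind of near miss is to make the bulk-erase script refuse to run unless the active subscription is on an explicit list of disposable ones. This is only a sketch: the subscription IDs, the exception class, and the function name are all hypothetical, and how the active subscription ID is obtained is left to the surrounding script.

```python
class WrongSubscriptionError(RuntimeError):
    """Raised when a destructive script targets a non-disposable subscription."""


def confirm_bulk_delete_target(active_subscription_id, disposable_subscription_ids):
    """Return the target ID only if it is explicitly marked as disposable."""
    if active_subscription_id not in disposable_subscription_ids:
        raise WrongSubscriptionError(
            f"Refusing to bulk delete in {active_subscription_id!r}: "
            "it is not on the disposable list."
        )
    return active_subscription_id


# Hypothetical IDs for illustration only.
DISPOSABLE = {"lab-subscription-id"}

confirm_bulk_delete_target("lab-subscription-id", DISPOSABLE)  # passes
try:
    confirm_bulk_delete_target("production-subscription-id", DISPOSABLE)
except WrongSubscriptionError as err:
    print(err)
```

An allow list fails safe: a new or unknown subscription is protected by default, whereas a deny list would only protect the subscriptions someone remembered to add.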