Comment by Johnny555

9 days ago

I think they alluded to that earlier in the article:

>However, due to the system’s large-capacity, low-performance storage structure, no external backups were maintained — meaning all data has been permanently lost.

I think they decided that their storage was too slow to allow backups?

Seems hard to believe that they couldn't manage any backups... other sources said they had around 900TB of storage. An LTO-9 tape holds 18TB uncompressed, so they could have backed up the entire system with 50 tapes. At 300MB/sec with a single drive, a full backup would take about a month, so even a slow storage system should be able to keep up with that rate. They'd have a backup that's always a month out of date, but that seems better than no backup at all.
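A quick back-of-envelope check of that math. The 900TB figure is the number reported elsewhere, 18TB is LTO-9 native capacity, and 300MB/s is an assumed sustained single-drive rate, so treat the output as an estimate:

```python
# Rough tape math for a single-drive full backup (all inputs are assumptions).
TOTAL_DATA_TB = 900      # reported total storage
LTO9_NATIVE_TB = 18      # LTO-9 native (uncompressed) capacity per tape
DRIVE_RATE_MB_S = 300    # assumed sustained throughput of one drive

tapes_needed = -(-TOTAL_DATA_TB // LTO9_NATIVE_TB)        # ceiling division
seconds = (TOTAL_DATA_TB * 1_000_000) / DRIVE_RATE_MB_S   # TB -> MB, then MB / (MB/s)
days = seconds / 86_400

print(f"{tapes_needed} tapes, ~{days:.0f} days for one full pass with a single drive")
# -> 50 tapes, ~35 days for one full pass with a single drive
```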

Too slow to allow batched backups. Which means you should just make redundant copies at the time of the initial save. Encrypt a copy and send it offsite immediately.

If your storage performance is low then you don't need fat pipes to your external provider either.
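To illustrate the idea (not anything the article describes), here's a minimal sketch of "encrypt a copy and ship it off-site at save time". The hook that triggers it, the paths, the key handling, and the transport are all placeholders; Fernet from the `cryptography` package is just one convenient way to encrypt a blob:

```python
# Sketch: encrypt each newly saved file and push the ciphertext off-site immediately.
# Paths, key management, and the "upload" are stand-ins, not a real deployment.
from pathlib import Path
from cryptography.fernet import Fernet  # pip install cryptography

KEY = Fernet.generate_key()  # in practice: a managed key, never stored next to the data
fernet = Fernet(KEY)

def backup_on_save(saved_file: Path, offsite_dir: Path) -> Path:
    """Encrypt a just-written file and copy the ciphertext to off-site storage."""
    ciphertext = fernet.encrypt(saved_file.read_bytes())
    dest = offsite_dir / (saved_file.name + ".enc")
    dest.write_bytes(ciphertext)  # stand-in for an upload over a modest WAN link
    return dest

# Called from whatever hook fires after the primary write completes, e.g.:
# backup_on_save(Path("/data/records/doc-0001.pdf"), Path("/mnt/offsite"))
```

Because the copy happens per record, the off-site link only has to keep pace with the (slow) write rate of the primary system.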

They either built this too quickly, or there was too much industry corruption perverting the process and the government bought an off-the-shelf solution that was inadequate for their actual needs.

Let's run the numbers:

LTO-9 tapes run ~$92 each in bulk. A 4-drive library with 80-slot capacity costs ~$40k* and can sustain about 1 Gbps. It also needs someone to barcode, inventory, and swap tapes once a week, plus an off-site vaulting provider like Iron Mountain; that's another $100k/year. The library itself will need to be replaced every 4-7 years, call it 6. The tapes wear out after a certain number of uses and sometimes just go bad. It might also require buying a server and/or backup/DR software. A fire-rated data safe is recommended too, sized for about 1-2 weeks' worth of backups plus spare media. Budget at least $200k/year for off-site tape backups for a minimal operation. (Let me tell you about the pains of self-destructing SSL2020 AIT-2 Sony drives.)
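Roughly how that annual floor adds up, using the rough assumptions above (the media turnover per year is my own guess, not from any source):

```python
# Annualized cost of the minimal off-site tape operation sketched above.
# All figures are rough assumptions, not quotes.
library_cost = 40_000         # 4-drive, 80-slot library
library_life_years = 6        # replaced every ~4-7 years
tapes_per_full_set = 50       # one full pass over ~900TB at 18TB/tape
tape_cost = 92                # LTO-9, bought in bulk
sets_bought_per_year = 4      # hypothetical turnover from wear and retention
staff_and_vaulting = 100_000  # weekly tape handling + off-site vaulting contract

annual = (library_cost / library_life_years
          + tapes_per_full_set * tape_cost * sets_bought_per_year
          + staff_and_vaulting)
print(f"~${annual:,.0f}/year before servers, backup software, or a data safe")
# -> ~$125,067/year, so a $200k/year all-in budget is not a stretch
```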

If backups for this and other critical services were combined, it would probably be cheaper to scale that shared service rather than reinventing the wheel for just one use-case in one department. Consolidation would also allow several optimizations: network-based backups staged to nearline storage and then streamed more directly to tape, many more tape drives, possibly one or more tape silo robots, and perhaps splitting the data across 2-3 backup locations, obviating the need for off-site vaulting.

Furthermore, it might be simpler, although more expensive, to operate another hot-/warm-site for backups and temporary business-continuity restoration using a pile of HDDs and a network connection that's probably faster than that tape library. (Use backups, not replication, because replication happily copies errors to the other sites too.)

Or the easiest option is to use one or more cloud vendors for even more $$$ (build vs. buy tradeoff).

* Traditionally (~20 years ago), enterprise "retail" gear was priced at around a 100% markup, allowing for discounts of up to around 50% when negotiated on large orders. Enterprise gear also had a lifecycle of around 4.5 years; while it might still technically work after that, there would be no vendor support or replacements, so enterprise customers are locked into perpetual planned-obsolescence consumption cycles.

  • $500K/year to back up a system used by 750,000 people is about $0.67 per user per year. Practically free.

    At least now they see the true cost of not having any off-site backups. It's a lot more than $0.67 per user.
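For what it's worth, the per-user arithmetic on both figures floated in this thread (the budgets themselves are the rough estimates above, not real quotes):

```python
# Per-user cost of a hypothetical backup budget spread over the user base.
users = 750_000
for annual_budget in (200_000, 500_000):  # parent's floor and this comment's figure
    print(f"${annual_budget:,}/year -> ${annual_budget / users:.2f} per user per year")
# -> $200,000/year -> $0.27 per user per year
# -> $500,000/year -> $0.67 per user per year
```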