Comment by kadoban
3 years ago
If you keep a ZFS mirror of the most-recent N drives, and store them separately, that should be pretty good.
I forget how ZFS behaves if a mirror is missing drives though, if some are off-site. Hopefully it's smart enough to let you do that and just rotate through.
ZFS doesn’t handle that well generally.
You’d most likely have better luck making a full (manual) copy (ZFS send/recv, or better yet mirror-then-split), assuming you run a scrub afterward.
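The send/recv route looks roughly like this; pool, dataset, and snapshot names (`tank`, `backup`, `@archive-1`) are placeholders, and it assumes you’ve already created a pool on the external disk:

```shell
# Snapshot the source dataset, then replicate it to the external pool
zfs snapshot tank/data@archive-1
zfs send tank/data@archive-1 | zfs receive backup/data

# Verify the copy end-to-end before shelving the disk
zpool scrub backup
zpool status backup
```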
Or manually make checksum files I guess. I’ve done that, less ‘magic’ that way.
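The checksum-file approach is just coreutils. A minimal sketch (the archive path here is a throwaway temp dir purely for illustration; point it at your real mount instead):

```shell
# Stand-in for your archive mount point
ARCHIVE=$(mktemp -d)
echo "example data" > "$ARCHIVE/file.txt"

# Record a checksum for every file under the archive
(cd "$ARCHIVE" && find . -type f ! -name SHA256SUMS -print0 \
  | xargs -0 sha256sum > SHA256SUMS)

# Later, on whatever machine the disk lands on:
(cd "$ARCHIVE" && sha256sum --check --quiet SHA256SUMS) && echo "archive OK"
```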
> ZFS doesn’t handle that well generally.
Can you expand on that?
The purpose and benefit of a zfs mirror is that every disk in the mirror contains everything. So it's expensive in space usage, but great for reliability. As long as any one of the disks in a mirror survives, you can recover everything.
The issue is that a mirror in ZFS is assumed to be ‘live’, unless you explicitly sever the relationship (unmirror it). Having tons of unreachable devices (because you have them sitting on a shelf somewhere) makes things unhappy operationally.
So if you have one ‘live’ copy, create a mirror, then sever the relationship before taking the external device offline, it’s fine.
Even taking it offline sometimes when it’s a normal live mirror is fine (though it will complain of course, depending on how it’s done).
But if you make copies by adding a bunch of mirrors and then taking them offline ‘forever’ (really, across multiple boot cycles), it makes the running ZFS instance unhappy, because it expects those mirror copies to still be accessible somewhere (after all, ZFS is a live filesystem) and will keep trying to access them without success.
I don’t think you’ll lose data or brick anything, but it will be a hassle at some point.
Also, if you reconnect those old disks, ZFS will try to resilver the stale copies (and hence modify them).
Which is not what I would want for an archive, unless I manually told it to do so anyway.
Which is easy enough to do of course even after severing the mirror relationship later, albeit with more disk I/O.
I’ve done this kind of archiving before, there is built in ZFS support for making a new zpool when splitting mirrors this way, and it works well.
The way I ended up doing it was primary/secondary disks (both external hard drives).
Set up as a mirror, copy the archive data over. Split the mirror, and now you have two (slightly differently named) unmirrored zpools with identical data that can both be mounted at the same time, scrubbed independently, etc.
Having one of them ‘live’ and the other one the archived copy (on the external disk) would be trivial, and allows you to zpool export the archived/external copy, name it with the date it was made, etc. - which is what you want to make everyone happy.
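A rough sketch of that workflow; the pool and device names (`tank`, `/dev/sdb`, `/dev/sdc`) and the date-stamped archive name are all hypothetical, and this of course needs real hardware and root:

```shell
# Attach the external disk as a second side of the mirror
zpool attach tank /dev/sdb /dev/sdc
zpool status tank                       # repeat until the resilver finishes

# Split the mirror: the detached side becomes its own, renamed pool
zpool split tank archive-2023-01 /dev/sdc

# Export it so the disk can sit on a shelf; import and scrub later to verify
zpool export archive-2023-01
```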
P.S. if doing this, be REALLY careful about what kernel/OS/ZFS features you are using with your pools or you can end up with an unmountable ZFS copy! (As I found out). Built-in ZFS encryption is a major footgun here. Better to use dmcrypt style encryption for several reasons.
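One way to guard against the feature-flag footgun, assuming OpenZFS 2.1 or later: pin the archive pool to a known feature set at creation time via the `compatibility` pool property. The file name below is one of the presets shipped in `/usr/share/zfs/compatibility.d`; pool/device names are placeholders:

```shell
# Create the archive pool with only features an older OpenZFS can import
zpool create -o compatibility=openzfs-2.0-linux archive /dev/sdc
zpool get compatibility archive
```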
If you offline or detach the drive, it should be fine afaik. Detach is probably the better way to go though, because it removes the drive from the status listing.
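For reference, the two operations look like this (pool/device names hypothetical): `offline` keeps the disk in the pool configuration and expects it back, while `detach` removes it from the mirror entirely:

```shell
# Temporarily take the disk out of service; the pool still lists it
zpool offline tank /dev/sdc
zpool online tank /dev/sdc    # resilvers the delta when it returns

# Permanently remove the disk from the mirror vdev
zpool detach tank /dev/sdc
```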
The issue is that you can’t `zpool import` a pool from a drive detached this way without ZFS modifying that drive. So if you want a point-in-time archive, that route is dangerous.
I think it’s nearly impossible to actually ZFS import a pool from just a single detached drive too if the original is nuked, but I imagine there is some special magic operation that might make it possible.
Splitting the new disk off into its own pool doesn’t have any of these issues, and is a much cleaner way to handle it.