Comment by jordanbeiber

6 years ago

ZFS, on Solaris, not robust?

ZFS for “play”?!

This... is just plain uninformed.

Not just me and my employer, but many (many) others rely on ZFS for critical production storage, and have done so for many years.

It’s actually very robust on Linux as well - the fact that FreeBSD has started to use the ZoL code base is quite telling.

Would FreeBSD also be in the “play” and “not robust” category, hanging out together with Solaris?

Will it beat everything else in terms of writes/s? Most likely not - although if you stay away from dedup, give it enough RAM, and stick to the fairly common recommendation to use only mirror vdevs in your pools, it can be competitive.

Something solid with data integrity guarantees? You can’t beat ZFS, imo.

> Something solid with data integrity guarantees? You can’t beat ZFS, imo.

This reminds me. We had one file server, used mostly for package installs, that used ZFS for storage. One day our Java package stops installing. The package had become corrupt. So I force a manual ZFS scrub. No dice. OK, fine, I’ll just replace the package. It seems to work, but the next day it’s corrupt again. Weird. OK, I’ll download the package directly from Oracle again. The next day it’s corrupt again. I download a slightly different version. No problems. I grab the previous problematic package and put it in a different directory (with no other copies on the file system) - again it becomes corrupt.

There was something specific about the java package that ZFS just thought it needed to “fix”. If I had to guess it was getting the file hash confused. I’m pretty sure we had dedupe turned on so that may have factored into it.

Anyway that’s the first and only time I’ve seen a file system munge up a regular file for no reason - and it was on ZFS.

Performance wasn't robust, especially with dead disks and during rebuilds, but also on pools with many (>100) filesystems or snapshots. Performance would often degrade heavily and unpredictably on such occasions. We didn't lose data more often than with other systems.

"play" comes from my distinct impression that the most vocal ZFS proponents are hobbyists and admins herding their pet servers (as opposed to cattle). ZFS comes at low/no cost nowadays and is easy to use, therefore ideal in this world.

  • Fair enough, I can’t argue with your personal experience, but I can assure you that ZFS is used “for real” at many shops.

    I’ve only used ZFS in two- or three-way mirror setups, on beefy boxes, where the issues you describe are minimal. Also JBOD only.

    The thing is that without checksumming you actually have no idea whether you’ve lost data. I’ve had several pools over the years report automatic resilvering on checksum mismatches. Usually it’s been disks acting up well before SMART can tell, and having this reported has been invaluable.
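    For anyone wondering what "resilvering on checksum mismatches" amounts to, here is a toy sketch of the idea in Python. It is not how ZFS is implemented: the `MirroredBlock` class and its fields are made up for illustration, and SHA-256 stands in for whatever checksum the pool is actually configured with. The point is only that a checksum stored apart from the data lets a read tell a good mirror copy from a silently corrupted one, and rewrite the bad copy from the good one.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum (assumption: SHA-256 chosen for simplicity).
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Hypothetical toy model: one block kept on several mirror copies,
    with the expected checksum stored separately from the data."""

    def __init__(self, data: bytes, copies: int = 2):
        self.copies = [bytearray(data) for _ in range(copies)]
        self.expected = checksum(data)
        self.repaired = 0  # how many bad copies have been rewritten

    def read(self) -> bytes:
        good = None
        bad = []
        # Verify every copy against the stored checksum.
        for i, copy in enumerate(self.copies):
            if checksum(bytes(copy)) == self.expected:
                if good is None:
                    good = bytes(copy)
            else:
                bad.append(i)
        if good is None:
            # No copy verifies: the error is detected but cannot be repaired.
            raise IOError("all copies fail checksum; unrecoverable")
        # Self-heal: overwrite each bad copy with the verified data.
        for i in bad:
            self.copies[i] = bytearray(good)
            self.repaired += 1
        return good


block = MirroredBlock(b"important data")
block.copies[0][0] ^= 0xFF  # simulate silent corruption on one disk
data = block.read()         # returns the good copy and repairs the bad one
print(data, block.repaired)
```

    Without the stored checksum, the read would have no way to know which of the two copies to trust, which is the "no idea whether you’ve lost data" situation.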