Comment by GuB-42

5 days ago

To be honest, the situation with Linux is barely better.

ZFS has license issues with Linux, preventing full integration, and Btrfs is 15 years in the making and still doesn't match ZFS in features and stability.

Most Linux distros still use ext4 by default, which is 19 years old, but ext4 is little more than a series of extensions on top of ext2, which is the same age as NTFS.

In all fairness, there are few OS components that are as critical as the filesystem, and many wouldn't touch filesystems that have less than a decade of proven track record in production.

ZFS might be better than any other FS on Linux (I don't judge that).

But you must admit that the situation on Linux is considerably better than on Windows. Linux has so many filesystems in the mainline kernel. There is a lot of development. BTRFS had a rocky start, but it got better.

I’m interested to know what ‘full integration’ would look like. I use ZFS in Proxmox (Debian-based) and it’s really great and super solid, but I haven’t used ZFS in more vanilla Linux distros. Does Proxmox have things that regular Linux is missing out on, or are there shortcomings and things I just don’t realise about Proxmox?

  • The difference is that the ZFS kernel module is included by default with Proxmox, whereas with e.g. Debian, you would need to install it manually.

  • You probably don’t realise how important encryption is.

    It’s still not supported by Proxmox. Yes, you can set it up yourself somehow, but then you are on your own and miss features, and people report problems with double or triple filesystem layers.

    I do not understand why they do not offer encryption out of the box; this seems to be a problem.

As far as stability goes, btrfs is used by Meta, Synology and many others, so I wouldn't say it's not stable, but some features are lacking.

  • My understanding is that single-disk btrfs is good, but raid is decidedly dodgy; https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid5... states that:

    > The RAID56 feature provides striping and parity over several devices, same as the traditional RAID5/6.

    > There are some implementation and design deficiencies that make it unreliable for some corner cases and *the feature should not be used in production, only for evaluation or testing*.

    > The power failure safety for metadata with RAID56 is not 100%.

    I have personally been bitten once (about 10 years ago) by btrfs just failing horribly on a single desktop drive. I've used either mdadm + ext4 (for /) or zfs (for large /data mounts) ever since. Zfs is fantastic and I genuinely don't understand why it's not used more widely.

    • One problem with your setup is that ZFS by design can't use a traditional *nix filesystem buffer cache. Instead it has to use its own ARC (adaptive replacement cache) with end-to-end checksumming, transparent compression, and copy-on-write semantics. This can lead to annoying performance problems when the two types of filesystem cache compete for available memory. There is a backpressure mechanism, but it effectively pauses other writes while evicting dirty cache entries to release memory.
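To get a feel for how much memory the ARC holds relative to the rest of the system, the counters OpenZFS exposes on Linux can be parsed. This is a minimal Python sketch, assuming the kstat file is at /proc/spl/kstat/zfs/arcstats and uses "name type data" rows. The function names and the sample numbers are illustrative and are not measurements from this thread.

```python
# Illustrative sample in the kstat "name type data" layout (made-up values).
SAMPLE_ARCSTATS = """\
13 1 0x01 3 816 1000 2000
name                            type data
hits                            4    900
misses                          4    100
size                            4    4294967296
c_max                           4    8589934592
"""


def parse_arcstats(text):
    """Return a dict of counter name -> int from kstat-style text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields with a numeric value;
        # the two header rows do not match and are skipped.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def arc_summary(stats, mem_total_bytes):
    """Summarise ARC hit ratio and how much RAM the ARC currently occupies."""
    lookups = stats["hits"] + stats["misses"]
    return {
        "hit_ratio": stats["hits"] / lookups if lookups else 0.0,
        "arc_share_of_ram": stats["size"] / mem_total_bytes,
        "arc_fill": stats["size"] / stats["c_max"],
    }
```

On a machine with ZFS loaded, the text would come from reading the arcstats file instead of the sample string. A large arc_share_of_ram alongside a busy page cache is the kind of contention described above.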

    • I was assuming OP wants to highlight filesystem use on a workstation/desktop, not for a file server/NAS. I had a similar experience a decade ago, but these days single drives just work, same with mirroring. For such setups btrfs should be stable. I've never seen a workstation with a raid5/6 setup. Secondly, filesystems and volume managers are different things, even if e.g. btrfs and ZFS are essentially both.

      For a NAS setup I would still prefer ZFS with TrueNAS SCALE (or Proxmox if virtualization is needed), just because all these scenarios are supported as well. And as far as ZFS goes, encryption is still something I am not sure about, especially since I want to send snapshots as backups to a remote machine.

    • > I have personally been bitten once (about 10 years ago) by btrfs just failing horribly on a single desktop drive.

      Me, too. The drive was unrecoverable. I had to reinstall from scratch.

  • I'm similar to some other people here. I guess once you've been bitten by data loss due to btrfs, it's difficult to advocate for it.

    • I am assuming almost everybody at some point experienced data loss because they pulled out a flash drive too early. Is it safe to assume that we stopped using flash drives because of it?

  • Does Synology actually use the multi-device options of btrfs, or are they using Linux softraid + LVM underneath?

    I know Synology Hybrid RAID is a clever use of LVM + MD raid, for example.

> Btrfs [...] still doesn't match ZFS in features [...]

Isn't the feature in question (array expansion) precisely one which btrfs already had for a long time? Does ZFS have the opposite feature (shrinking the array), which AFAIK btrfs also already had for a long time?

(And there's one feature which is important to many, "being in the upstream Linux kernel", that ZFS most likely will never have.)

  • ZFS also had expansion for a long time, but it was offline expansion. I don't know whether btrfs has also had online expansion for a long time.

    And shrinking no, that is a big missing feature in ZFS IMO. Understandable considering its heritage (large scale datacenters) but nevertheless an issue for home use.

    But raidz is rock-solid. Btrfs' raid is not.

> Most Linux distros still use ext4 by default, which is 19 years old, but ext4 is little more than a series of extensions on top of ext2, which is the same age as NTFS.

However, ext4 and XFS are much simpler and more performant than BTRFS & ZFS as root drives on personal systems and small servers.

I personally won't use either on a single disk system as root FS, regardless of how fast my storage subsystem is.

  • ZFS will outscale ext4 in parallel workloads with ease. XFS will often scale better than ext4, but if you use L2ARC and SLOG devices, it is no contest. On top of that, you can use compression for an additional boost.

    You might also find ZFS outperforms both of them in read workloads on single disks where ARC minimizes cold cache effects. When I began using ZFS for my rootfs, I noticed my desktop environment became more responsive and I attributed that to ARC.

    • No doubt. I want to reiterate my point. Citing myself:

      > "I personally won't use either on a single disk system as root FS, regardless of how fast my storage subsystem is." (emphasis mine)

      We are no strangers to filesystems. I personally benchmarked a ZFS7320 extensively and wrote a characterization report, and we have had a ZFS7420 for a very long time, complete with separate log SSDs for read and write on every box.

      However, ZFS is not saturation-proof, and it is nowhere near a Lustre cluster performance-wise when scaled.

      What kills ZFS and BTRFS on desktop systems is write performance, especially under heavy workloads like system updates. If I need a desktop server (performance-wise), I'd configure it accordingly and use these, but I'd never use BTRFS or ZFS on a single root disk due to their overhead, to reiterate myself thrice.

  https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html

ZFS runs on all major Linux distros, the source is compiled locally and there is no meaningful license problem. In datacenter and "enterprise" environments we compile ZFS "statically" with other kernel modules all the time.

For over six years now, the graphical Ubuntu installer has presented an "experimental" option to install the root filesystem on ZFS. Almost everyone I personally know (just my anecdote) chooses this "experimental" option. There has been an occasion here and there of ZFS snapshots taking up too much space, but other than this there have not been any problems.

I statically compile ZFS into a kernel that intentionally does not support loading modules on some of my personal laptops. My experience has been great, others' mileage may (certainly will) vary.

ZFS on OS X was killed because of Oracle licensing drama. I don’t expect anything better on Windows either.

  • There is a third party port here:

    https://openzfsonosx.org/wiki/Main_Page

    It was actually the NetApp lawsuit that caused problems for Apple’s adoption of ZFS. Apple wanted indemnification from Sun because of the lawsuit. Sun’s CEO did not sign the agreement before Oracle’s acquisition of Sun happened, and Oracle had no interest in granting that, so the official Apple port was cancelled.

    I heard this second hand years later from people who were insiders at Sun.

    • That’s a shame re: NetApp/ZFS.

      While third-party ports are great, they lack deep integration that first-party support would have brought (non-kludgy Time Machine which is technically fixed with APFS).

  • > ZFS on OS X was killed because of Oracle licensing drama.

    It was killed because Apple and Sun couldn't agree on a 'support contract'. From Jeff Bonwick, one of the co-creators of ZFS:

    >> Apple can currently just take the ZFS CDDL code and incorporate it (like they did with DTrace), but it may be that they wanted a "private license" from Sun (with appropriate technical support and indemnification), and the two entities couldn't come to mutually agreeable terms.

    > I cannot disclose details, but that is the essence of it.

    * https://archive.is/http://mail.opensolaris.org/pipermail/zfs...

    Apple took DTrace, licensed via CDDL just like ZFS, and put it into the kernel without issue. Of course a file system is much more central to an operating system, so they wanted much more of a CYA for that.

The license is not a real issue. ZFS just has to be distributed as a separate module. No big hurdle.

  • The main hurdle is hostile Linux kernel developers who aren't held accountable intentionally breaking ZFS for their own petty ideological reasons, e.g. removing the in-kernel FPU/SIMD register save/restore API and replacing it with a "new" API to do the same.

    What's "new" about the "new" API? Its symbols are GPL2-only, to deny its use to non-GPL2 modules (like ZFS). Guess that's an easy way to make sure that BTRFS is faster than ZFS or set yourself up as the (to be) injured party.

    Of course a reimplementation of the old API in terms of the new is an evil "GPL condom" violating the kernel license, right? Why can't you see that ZFS's CDDL license is the real problem here, for being the wrong flavour of copyleft license? Way to claim the moral high ground, you short-sighted, bigoted pricks. sigh

  • From my point of view it is a real usability issue.

    ZFS modules are not in the official repos. You either have to compile them on each machine or use unofficial repos, which is not exactly ideal and can break things if those repos are not up to date. And I guess it also needs some additional steps for Secure Boot setup on some distros?

    I really want to try zfs because btrfs has some issues with RAID5 and RAID6 (it is not recommended so I don't use it) but I am not sure I want to risk the overall system stability, I would not want to end up in a situation where my machines don't boot and I have to fix it manually.

    • I have been using ZFS on Mint and Alpine Linux for years for all drives (including root) and have never had an issue. It's been fantastic and is super fast. My linux/zfs laptop loads games much faster than an identical machine running Windows.

      I have never had data corruption issues with ZFS, but I have had both xfs and ext4 destroy entire discs.

    • Why are you considering raid5/6? Are you considering building a large storage array? If the data will fit comfortably (50-60% utilization) on one drive, all you need is raid1. Btrfs is fine for raid1 (raid1c3 for extra redundancy); it might have hidden bugs, but no filesystem is immune from those; zfs had a data loss bug (it was rare, but it happened) a year ago.

      Why use zfs for a boot partition? Unless you're using every disk mounting point and nvme slot for a single large raid array, you can use a cheap 512GB nvme drive or old spare 2.5" ssd for the boot volume. Or two, in btrfs raid1 if you absolutely must... but do you even need redundancy or datasum (which can hurt performance) to protect OS files? Do you really care if static package files get corrupted? Those are easily reinstalled, and modern quality brand SSDs are quite reliable.
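The capacity trade-off behind "all you need is raid1" can be worked through with some rough arithmetic. This is a minimal Python sketch assuming equally sized drives and ignoring metadata overhead. The profile table, the function names, and the 60% ceiling (taken from the 50-60% figure above) are illustrative assumptions, not btrfs or ZFS tooling.

```python
# Rough usable-capacity arithmetic for equally sized drives.
# Ignores metadata overhead and filesystem reserve; illustrative only.

COPIES = {"raid1": 2, "raid1c3": 3}  # full copies kept of every block
PARITY = {"raid5": 1, "raid6": 2}    # drives' worth of parity per stripe


def usable_tb(profile, n_drives, drive_tb):
    """Approximate usable capacity in TB for n equally sized drives."""
    if profile in COPIES:
        copies = COPIES[profile]
        if n_drives < copies:
            raise ValueError(f"{profile} needs at least {copies} drives")
        return n_drives * drive_tb / copies
    if profile in PARITY:
        parity = PARITY[profile]
        if n_drives <= parity:
            raise ValueError(f"{profile} needs more than {parity} drives")
        return (n_drives - parity) * drive_tb
    raise ValueError(f"unknown profile: {profile}")


def fits_comfortably(data_tb, capacity_tb, max_utilization=0.6):
    """True if the data stays at or below the chosen utilization ceiling."""
    return data_tb / capacity_tb <= max_utilization
```

For example, two 4 TB drives in raid1 give about 4 TB usable, so 2 TB of data sits at 50% utilization and fits. The parity profiles only start to pay off once the data no longer fits on a mirrored pair.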

  • It is a problem because most of the internal kernel APIs are GPL-only, which limits the abilities of the ZFS module. It is a common source of argument between the Linux guys and the ZFS on Linux guys.

    The reason for this is not just to piss off non-GPL module developers. GPL-only internal APIs are subject to change without notice, even more so than the rest of the kernel. And because the licence may not allow the Linux kernel developers to make the necessary changes to the module when it happens, there is a good chance it breaks without warning.

    And even with that, all internal APIs may change, it is just a bit less likely than for the GPL-only ones, and because ZFS on Linux is a separate module, there is no guarantee for it to not break with successive Linux versions, in fact, it is more like a guarantee that it will break.

    Linux is proudly monolithic, and with a constantly evolving monolithic kernel, developers need to have control over the entire project. It is also community-driven. Combined, this means you need rules for the community to work together, or everything will break down, and that's what the GPL is for.

  • I remember it being a pain in the ass on Fedora which tracks closely to mainline. Frequently a new kernel version would come out that zfs module didn't support so you'd have to downgrade and hold back the package until support was added.

    Fedora packages zfs-fuse. I think some distros have arrangements to make sure kernels have zfs support. It may be less of a headache on those.

    In-tree filesystems don't break that way.
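The "hold back the kernel until the module supports it" routine described above can be checked before an upgrade. This is a minimal Python sketch, assuming the OpenZFS source tree's META file carries a "Linux-Maximum" field naming the newest supported kernel series. The sample META text, the kernel release strings, and the function names are hypothetical.

```python
import re

# Illustrative META excerpt; the version numbers are made up for the example.
SAMPLE_META = """\
Name:          zfs
Linux-Maximum: 6.12
Linux-Minimum: 4.18
"""


def parse_version(text):
    """Extract (major, minor) from strings like '6.12' or '6.13.2-200.fc41'."""
    match = re.match(r"(\d+)\.(\d+)", text.strip())
    if not match:
        raise ValueError(f"cannot parse version from {text!r}")
    return int(match.group(1)), int(match.group(2))


def linux_maximum(meta_text):
    """Return the (major, minor) of the Linux-Maximum field in META text."""
    for line in meta_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Linux-Maximum":
            return parse_version(value)
    raise ValueError("Linux-Maximum field not found")


def kernel_supported(kernel_release, meta_text):
    """True if the kernel's major.minor is not newer than Linux-Maximum."""
    return parse_version(kernel_release) <= linux_maximum(meta_text)
```

A package hook could run this against the output of `uname -r` for the candidate kernel and skip the upgrade when it returns False. An in-tree filesystem never needs such a check.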

You've been able to add and remove devices at will for a long time with btrfs (only recently supported in zfs, with lots of caveats).

Btrfs also supports async/offline dedupe.

You can also layer it on top of mdadm. IIRC zfs strongly discourages using anything but direct-attached physical disks.