Comment by Hamuko

6 years ago

That's his reasoning for not merging ZFS code, not for generally avoiding ZFS.

Here are his reasons for generally avoiding ZFS from what I consider most important to least.

- The kernel team may break it at any time, and won't care if they do.

- It doesn't seem to be well-maintained.

- Performance is not that great compared to the alternatives.

- Using it opens you up to the threat of lawsuits from Oracle. Given history, this is a real threat. (This is one that should be high for Linus but not for me - there is no conceivable reason that Oracle would want to threaten me with a lawsuit.)

  • I'm baffled by such arguments.

    > It doesn't seem to be well-maintained.

    The last commit is from 3 hours ago: https://github.com/zfsonlinux/zfs/commits/master. They have dozens of commits per month. The last minor release, 0.8, brought significant improvements (my favorite: FS-level encryption).

    Or maybe this refers to the (initial) incompatibility with the 5.0 kernel? That wasn't the ZFS dev team's fault.

    > Performance is not that great compared to the alternatives.

    There are no (stable) alternatives. BTRFS certainly not, as it's "under heavy development"¹ (since... forever).

    > The kernel team may break it at any time, and won't care if they do.

    That's true; however, the amount of breakage is no different from any other out-of-tree module, and it is unlikely to happen with a patch version of a working kernel (in fact, it happened with the 5.0 release).

    > Using it opens you up to the threat of lawsuits from Oracle. Given history, this is a real threat. (This is one that should be high for Linus but not for me - there is no conceivable reason that Oracle would want to threaten me with a lawsuit.)

    "Using" it won't open you up to lawsuits; ZFS has a CDDL license, which is a free and open-source software license.

    The problem is (taking Ubuntu as representative) shipping the compiled module along with the kernel, which is an entirely different matter.

    ---

    [¹] https://btrfs.wiki.kernel.org/index.php/Main_Page#Stability_...

    • > ZFS has a CDDL license

      Java is GPLv2 with the Classpath Exception. That didn't stop Oracle because, as Linus pointed out in the email, Oracle regards their APIs as a separate entity to their code.

    • > There are no (stable) alternatives. BTRFS certainly not, as it's "under heavy development"¹ (since... forever).

      Note that they don't mean "it's unstable," just "there are significant improvements between versions." Most importantly:

      > The filesystem disk format is stable; this means it is not expected to change unless there are very strong reasons to do so. If there is a format change, filesystems which implement the previous disk format will continue to be mountable and usable by newer kernels.

      ...and only _new features_ are expected to stabilise:

      > As with all software, newly added features may need a few releases to stabilize.

      So overall, at least as far as their own claims go, this is not "heavy development" as in "don't use."

    • > There are no (stable) alternatives. BTRFS certainly not, as it's "under heavy development"¹ (since... forever).

      Unless you are living in 2012 on a RHEL/CentOS 6/7 machine, btrfs has been stable for a long time now. I have been using btrfs as the sole filesystem on my laptop in standard mode, on my desktop as RAID0 and on my NAS as RAID1 for more than two years. I have experienced absolutely zero data loss. In fact, btrfs recovered my laptop and desktop from broken package updates many times.

      You might have had some issues when you tried btrfs on distros like RHEL that did not backport the patches to their stable versions because they don't support btrfs commercially. Try something like openSUSE, which backports btrfs patches to stable versions, or use something like Arch.

      > That's true; however, the amount of breakage is no different from any other out-of-tree module, and it is unlikely to happen with a patch version of a working kernel (in fact, it happened with the 5.0 release).

      This is a filesystem we are talking about. Under no circumstances will any self-respecting sysadmin use a file system that has even a small chance of breaking with a system update.

  • A former employer was threatened by Oracle because some downloads for the (only free for noncommercial use) VirtualBox Extension Pack came from an IP block owned by the organization. Home users are probably safe, but Oracle's harassment engine has incredible reach.

    • My employer straight up banned the use of VirtualBox entirely _just in case_. They'd rather pay for VMWare Fusion licenses than deal with any potential crap from Oracle.

    • Well ... that sounds initially unreasonable, but then if I think about it a bit more I'm not sure how you'd actually enforce a non-commercial use only license without some basic heuristic like "companies are commercial".

      Is the expectation here that firms offering software under non-commercial-use-is-free licenses just run it entirely on the honour system? And isn't it true that many firms use unlicensed software, hence the need for audits?

  • > There is no conceivable reason that Oracle would want to threaten me with a lawsuit.

    I don't think it has to be conceivable with Oracle...

    Unfortunately I have to agree with Linus on this one. Messing with Oracle's stuff is dangerous if you can't afford a comparable legal team.

    • "Oracle's stuff" can most often be described more accurately as "what Oracle considers its stuff".

    • Linus is distributing the kernel, a very different beast from using a kernel module. I can't imagine Oracle targeting someone for using ZFS on Linux without first establishing that the distribution of ZFS on Linux is illegal.

  • > there is no conceivable reason that Oracle would want to threaten me with a lawsuit.

    Money. Anecdotally that's the primary reason Oracle do anything.

    • If anyone thinks this is hyperbole:

      I worked for a tiny startup (>2 devs full time) where Oracle tried to extract money from us because we used MariaDB on AWS.

      If you think this sounds ridiculous you probably got it right.

      (Why? Because someone inexperienced with Oracle had filled out the form while downloading the MySQL client.)

  • "there is no conceivable reason that Oracle would want to threaten me with a lawsuit."

    Don't be so sure about this.

  • None of these are good reasons to purposely hinder the optional use of ZFS as a third party module by users, which is what Linux is doing.

    • Can you expand? I'm no expert - I use Linux daily but have always just used the distro's default file system. Linus' reasons for not integrating seem pretty sensible to me. Oracle certainly has form on the litigation front.

    • This wasn't a case of "purposely hinder"; rather, the ZFS module broke because of some kernel changes. The kernel is careful to never break userspace and never break its own merged modules. But if you're a third-party module then you're on your own. The kernel developers can't be responsible for maintaining compatibility with your stuff.

  • > - Performance is not that great compared to the alternatives.

    CoW filesystems do trade performance for data safety. Or did you mean there are other _stable/production_ CoW filesystems with better performance? If so, please do point them out!

  • >- Using it opens you up to the threat of lawsuits from Oracle. Given history, this is a real threat. (This is one that should be high for Linus but not for me - there is no conceivable reason that Oracle would want to threaten me with a lawsuit.)

    No. Distributing (i.e. a precompiled distro with ZFS) will. You are free to run any software on your machine as you so desire.

  • This reminds me of the adaptation of a Churchill quote that "ZFS is the worst of the file systems, except for all others."

The problem with ZFS is that it isn't part of the Linux kernel.

The Linux project maintains compatibility with userspace software, but it does not maintain compatibility with 3rd-party modules, and for a good reason.

Since modules have access to any internal kernel API, it is not possible to change anything within the kernel without considering 3rd-party code, if you want to keep that code working.

For this reason the decision was made that if you want your module to work, you need to make it part of the Linux kernel; then anybody who refactors anything needs to consider the modules affected by the change.

Not allowing the module to be part of the kernel is a disservice to your user base. While there are modules like that that are maintained moderately successfully (Nvidia, vmware, etc.) this is all at the cost of the user and userspace maintainers who have to deal with it.

  • It isn't just ZFS. All sorts of drivers get broken because Linux refuses to offer a stable API, saying your code should be in the kernel, but also often refuses to accept drivers into the kernel, even open-source code with no particular quality issues (e.g. quickcam, reiserfsv4).

    Use FreeBSD where there's a stable ABI and you don't have these problems.

  • Parent updated their post and my comment is no longer relevant.

    • I don't see how it's an insult to the users. It's saying that not allowing ZFS code to be distributed under the GPL and be maintained as part of the Linux kernel, is a disservice to ZFSonLinux users. Which I think is clearly right.

And he was doing fine up to that point. For IMO good reasons, ZFS will likely never be merged into Linux. And filesystem kernel modules from third parties have a pretty long history of breakage issues going back to some older Unixes.

That's going to be plenty of reason not to use ZFS for most people. The licensing by itself is also certainly a showstopper for many.

But I'm not sure his other comments are really fair and, had Oracle relicensed ZFS n years back, ZFS would almost certainly be shipping with Linux, whether or not as the typical default I can't say. It certainly wasn't just a buzzword and there were a number of interesting aspects to its approach.

Well, he says

> It was always more of a buzzword than anything else, I feel, and the licensing issues just make it a non-starter for me.

So presumably the licensing problem mentioned by your parent's comment is weighing heavily here. I think this "don't use ZFS" statement is most accurately targeted at distro maintainers. Anyone not actually redistributing Linux and ZFS in a way that would (maybe) violate the GPL is not at any risk. That means even large enterprises can get away with using ZoL.

It's exactly that, when combined with the longstanding practice of maintaining compatibility with userspace, but reserving the right to refactor kernel-space code whenever and wherever needed. If ZFS-on-linux breaks in a subtle or obvious way due to a change in linux, he can't afford to care about that - keeping the linux kernel codebase sane while adding new features, supported hardware, optimizations, and fixes at an honestly scary rate, is not that easy.

See also https://www.kernel.org/doc/html/latest/process/stable-api-no...

(FUSE is a stable user-space API if you want one ... it won't have the same performance and capabilities of course ...)

  • > he can't afford to care about that - keeping the linux kernel codebase sane while adding new features, supported hardware, optimizations, and fixes at an honestly scary rate, is not that easy.

    Maybe, but the complaints seem to be less about technical changes accidentally breaking ZFS and more about changes of a political nature, with speculation that they might have been meant to _intentionally_ break ZFS while pretending it was an accident, because ZFS isn't (and can never be) maintained in tree. Basically along the lines of "we don't like out-of-tree kernel modules, so we make life hard for them". No idea if this is actually the case or if people are just spinning things together. Even if it is the case, I'm not sure what I should think about it, because it's at least partially understandable.

    • Linus is rather tolerant (or apathetic) about non-GPL modules, but what he doesn't care to do is ensure that there is an appropriate set of non-GPL-marked exports available for external modules. If some other developer happens to mark some export GPL and it happens to be one key export needed by a non-GPL external module, Linus doesn't care, because he doesn't care about external modules.

      This has come up many times in the past. Keep in mind that Linux has always been GPLv2-only; it is not LGPL or anything like that.

      https://lwn.net/Articles/769471/

      https://lwn.net/Articles/603131/

      https://lkml.org/lkml/2012/2/7/451

"Don't use ZFS. It's that simple. It was always more of a buzzword than anything else, I feel, and the licensing issues just make it a non-starter for me."

When he says that, I think of the $500 million Sun spent on advertising Java.

  • Sun isn't going to sue anyone into oblivion any time soon, but Oracle sure will

    • Sun is all but defunct, I don't think I would characterize it as a subsidiary of Oracle.

    • That's kinda nonsensical IMO. If Oracle, the parent company, is trigger-happy, there are no guarantees they won't go further to protect their child companies' IP if they feel it's being infringed.

    • I was thinking more of "the buzzword" bit, and how it got to be such a well known technology.

Well he had this:

> as far as I can tell, it has no real maintenance behind it either any more

Which simply isn't true. They just released a new ZFS version with encryption built in (no more ZFS + LUKS) and they removed the SPL dependency (which didn't support Linux 5.0+ anyway).

I use ZFS on my Linux machines for my storage and I've been rather happy with it.

  • Same, for at least 6 years in a 4-drive raidz array. It always reads and writes at full gigabit Ethernet speeds, and I haven't had any downtime other than for FreeBSD updates, which are trivial even when going from 10.x to 11 to 12.

    • "Same" for the last ~4 years, starting with 8 disks and as of 2018, the 24-bay enclosure is full. Each vdev is a mirrored pair split across HBAs to sedate my paranoia. I've replaced a few drives after watching unreadable sector count slowly increase over a few months. I've also switched out most of the original 3TB pairs to 8TB and 10TB pairs. ~42TB usable and the box only has 16GB of RAM (because I can't get the used 32GB sticks to work, it's a picky mainboard and difficult to find matching ECC memory here in Europe). I haven't powered down much except to attempt to replace the RAM or during extremely hot days. Read/write speed is more or less max gigabit, even during rebuild after hot-swapping drives.

    • Same here (4-drive raidz for many years), though I do have an issue where deleting large files (~1 GB) takes around a minute and nobody seems to know why (I have plenty free space and RAM)...

    • A single 5400 rpm drive (the likes of a WD Red) should be able to saturate gigabit Ethernet. A 4-drive array should be basically idling.
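
      A rough back-of-the-envelope check of that claim, sketched in Python. Only the gigabit-to-megabyte conversion is exact; the 150 MB/s sequential throughput is an assumed, illustrative figure for a 5400 rpm drive, not a measurement, and protocol overhead is ignored.

```python
def link_megabytes_per_second(gigabits_per_second: float) -> float:
    """Raw link capacity in MB/s (decimal units, ignoring protocol overhead)."""
    return gigabits_per_second * 1000 / 8


# ASSUMPTION: illustrative sequential throughput for a single 5400 rpm drive.
ASSUMED_DRIVE_MB_PER_S = 150.0


def drive_can_saturate_link(drive_mb_per_s: float,
                            gigabits_per_second: float = 1.0) -> bool:
    """True if the drive's sequential throughput meets or exceeds the link capacity."""
    return drive_mb_per_s >= link_megabytes_per_second(gigabits_per_second)
```

      Under that assumption the link (125 MB/s raw) is the bottleneck for sequential transfers, not the disks; random I/O is a different story.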

Relevant bits:

"Don't use ZFS. It's that simple. It was always more of a buzzword than anything else, I feel, and the licensing issues just make it a non-starter for me.

The benchmarks I've seen do not make ZFS look all that great. And as far as I can tell, it has no real maintenance behind it either any more, so from a long-term stability standpoint, why would you ever want to use it in the first place?"

  • > The benchmarks I've seen do not make ZFS look all that great.

    The thing about ZFS that actually appeals to me is how much error-checking it does. Checksums/hashes are kept for both data and metadata, and those checksums are regularly checked to detect and fix corruption. As far as I know, it and filesystems with similar architectures are the only ones that can actually protect against bit rot.

    https://github.com/zfsonlinux/zfs/wiki/Checksums

    > And as far as I can tell, it has no real maintenance behind it either any more, so from a long-term stability standpoint, why would you ever want to use it in the first place?"

    It has as much maintenance as any open source project: http://open-zfs.org/. IIRC, it has more development momentum behind it than the competing btrfs project.

    • > those checksums are regularly checked to detect and fix corruption.

      I don't believe that's true. They are checked on access, but if left alone, nothing will verify them. From what I've read, you need to set up a cron job that runs scrubbing on some regular schedule.
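
      A minimal sketch of what such a scheduled job might run, in Python. It only wraps the `zpool scrub <pool>` command; the pool name `tank` used in the usage note and the helper names are hypothetical, and how you schedule it (cron, a systemd timer, etc.) is up to you.

```python
import subprocess
from typing import List


def scrub_command(pool: str) -> List[str]:
    """Build the argv for starting a scrub on the given pool."""
    # Reject empty names and anything that could be parsed as an option.
    if not pool or pool.startswith("-"):
        raise ValueError("invalid pool name: %r" % pool)
    return ["zpool", "scrub", pool]


def start_scrub(pool: str) -> int:
    """Start a scrub and return zpool's exit status (0 on success)."""
    # Needs sufficient privileges; the scrub itself continues in the background.
    return subprocess.run(scrub_command(pool), check=False).returncode
```

      For example, a script calling `start_scrub("tank")` could be invoked from a weekly cron entry; `zpool status` will show the scrub's progress and any errors it repaired.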

  • Linus is just wrong as far as maintenance, as a look at the linux-zfs lists would show.

    From my perspective, it has no real competitor under Linux, which is why I use it. I don't consider btrfs mature enough for critical data. (Others can reasonably disagree; I have intentionally high standards for data durability.)

    Aside from legal issues, he's talking out of his ass.

  • Not sure where that belief comes from. But it might be that many benchmarks are naive and compare it against other filesystems in single-disc setups with zero tuning. Since its metadata overheads are higher, it's definitely slower in this scenario. However, put a pool onto an array of discs and tune it a little, and the performance scales up and up leaving all Linux-native filesystems, and LVM/dm/mdraid, well behind. It's a shame that Linux has nothing compelling to do better than this.

    • Last time I used ZFS, write performance was terrible compared to an ordinary RAID5. IIRC, writes in a raidz are always limited to a single disk’s performance. The only way to get better write speed is to combine multiple raidzs - which means you need a boatload of disks.

  • I think speed is not the primary reason many (most?) people use ZFS; I think it's mostly about stability, reliability and maintainability.