Comment by poisonborz

5 days ago

I just don't get how the Windows world - by far the largest PC platform by userbase - still doesn't have any answer to ZFS. Microsoft had WinFS and then ReFS, but the latter is on the back burner, and while there is active development (Win11 ships some bits from time to time), a release is nowhere in sight. There are some lone warriors attempting the giant task of creating a ZFS compatibility layer in a handful of projects, but they are far from mature/usable.

How come that Windows still uses a 32 year old file system?

To be honest, the situation with Linux is barely better.

ZFS has license issues with Linux, preventing full integration, and Btrfs is 15 years in the making and still doesn't match ZFS in features and stability.

Most Linux distros still use ext4 by default, which is 19 years old, but ext4 is little more than a series of extensions on top of ext2, which is the same age as NTFS.

In all fairness, there are few OS components that are as critical as the filesystem, and many wouldn't touch filesystems that have less than a decade of proven track record in production.

  • ZFS might be better than any other FS on Linux (I won't judge that).

    But you must admit that the situation on Linux is quite a bit better than on Windows. Linux has so many filesystems in the mainline kernel. There is a lot of development. Btrfs had a rocky start, but it got better.

  • I’m interested to know what ‘full integration’ looks like. I use ZFS in Proxmox (Debian-based) and it’s really great and super solid, but I haven’t used ZFS in more vanilla Linux distros. Does Proxmox have things that regular Linux is missing out on, or are there shortcomings and things I just don’t realise about Proxmox?

    • The difference is that the ZFS kernel module is included by default with Proxmox, whereas with e.g. Debian, you would need to install it manually.
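
      For reference, on Debian that looks roughly like this (a sketch; assumes the contrib component is enabled in sources.list, and the DKMS build takes a while):

          # ZFS lives in contrib and gets built locally as a DKMS module
          apt install linux-headers-amd64 zfs-dkms zfsutils-linux
          modprobe zfs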

      6 replies →

    • You probably don’t realise how important encryption is.

      It’s still not supported by Proxmox. Yes, you can do it yourself somehow, but then you are on your own, you miss features, and people report problems with double or triple filesystem layers.

      I do not understand how they don't have encryption out of the box; this seems like a problem.

      1 reply →

  • As far as stability goes, btrfs is used by Meta, Synology and many others, so I wouldn't say it's not stable, but some features are lacking.

    • My understanding is that single-disk btrfs is good, but raid is decidedly dodgy; https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid5... states that:

      > The RAID56 feature provides striping and parity over several devices, same as the traditional RAID5/6.

      > There are some implementation and design deficiencies that make it unreliable for some corner cases and *the feature should not be used in production, only for evaluation or testing*.

      > The power failure safety for metadata with RAID56 is not 100%.

      I have personally been bitten once (about 10 years ago) by btrfs just failing horribly on a single desktop drive. I've used either mdadm + ext4 (for /) or ZFS (for large /data mounts) ever since. ZFS is fantastic and I genuinely don't understand why it's not used more widely.
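
      The mdadm + ext4 half of that is only two commands, if anyone is curious (device names made up):

          # mirror two partitions with md RAID1, then put ext4 on top
          mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
          mkfs.ext4 /dev/md0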

      15 replies →

    • I'm similar to some other people here, I guess once they've been bitten by data loss due to btrfs, it's difficult to advocate for it.

      3 replies →

    • Do Synology actually use the multi-device options of btrfs, or are they using Linux softraid + LVM underneath?

      I know Synology Hybrid RAID is a clever use of LVM + MD raid, for example.

      2 replies →

  • > Btrfs [...] still doesn't match ZFS in features [...]

    Isn't the feature in question (array expansion) precisely one which btrfs already had for a long time? Does ZFS have the opposite feature (shrinking the array), which AFAIK btrfs also already had for a long time?

    (And there's one feature which is important to many, "being in the upstream Linux kernel", that ZFS most likely will never have.)

    • ZFS also had expansion for a long time, but it was offline expansion. I don't know if btrfs has also had online expansion for a long time?

      And shrinking, no; that is a big missing feature in ZFS IMO. Understandable considering its heritage (large-scale datacenters), but nevertheless an issue for home use.

      But raidz is rock-solid. Btrfs' raid is not.
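
      For what it's worth, the new raidz expansion (OpenZFS 2.3+) is one attach per disk, done online (pool and vdev names made up):

          # grow an existing raidz1 vdev by one disk while the pool stays in use
          zpool attach tank raidz1-0 /dev/sde
          zpool status tank   # shows the expansion progress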

      3 replies →

  • > Most Linux distros still use ext4 by default, which is 19 years old, but ext4 is little more than a series of extensions on top of ext2, which is the same age as NTFS.

    However, ext4 and XFS are much simpler and more performant than BTRFS & ZFS as root drives on personal systems and small servers.

    I personally won't use either of the latter as the root FS on a single-disk system, regardless of how fast my storage subsystem is.

    • ZFS will outscale ext4 in parallel workloads with ease. XFS will often scale better than ext4, but if you use L2ARC and SLOG devices, it is no contest. On top of that, you can use compression for an additional boost.

      You might also find ZFS outperforms both of them in read workloads on single disks where ARC minimizes cold cache effects. When I began using ZFS for my rootfs, I noticed my desktop environment became more responsive and I attributed that to ARC.
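
      The setup behind that is just a few commands (pool and device names made up):

          # dedicated SLOG and L2ARC devices, plus lz4 compression
          zpool add tank log /dev/nvme0n1p1
          zpool add tank cache /dev/nvme0n1p2
          zfs set compression=lz4 tank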

      21 replies →

  • https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html

    ZFS runs on all major Linux distros; the source is compiled locally, and there is no meaningful license problem. In datacenter and "enterprise" environments we compile ZFS "statically" alongside other kernel modules all the time.

    For over six years now, the graphical Ubuntu installer has offered an "experimental" option to install the root filesystem on ZFS. Almost everyone I personally know (just my anecdote) chooses this "experimental" option. There has been an occasion here and there of ZFS snapshots taking up too much space, but other than that there have not been any problems.

    I statically compile ZFS into a kernel that intentionally does not support loading modules on some of my personal laptops. My experience has been great, others' mileage may (certainly will) vary.

  • ZFS on OS X was killed because of Oracle licensing drama. I don’t expect anything better on Windows either.

    • There is a third party port here:

      https://openzfsonosx.org/wiki/Main_Page

      It was actually the NetApp lawsuit that caused problems for Apple’s adoption of ZFS. Apple wanted indemnification from Sun because of the lawsuit, Sun’s CEO did not sign the agreement before Oracle’s acquisition of Sun happened and Oracle had no interest in granting that, so the official Apple port was cancelled.

      I heard this second hand years later from people who were insiders at Sun.

      2 replies →

    • > ZFS on OS X was killed because of Oracle licensing drama.

      It was killed because Apple and Sun couldn't agree on a 'support contract'. From Jeff Bonwick, one of the co-creators of ZFS:

      >> Apple can currently just take the ZFS CDDL code and incorporate it (like they did with DTrace), but it may be that they wanted a "private license" from Sun (with appropriate technical support and indemnification), and the two entities couldn't come to mutually agreeable terms.

      > I cannot disclose details, but that is the essence of it.

      * https://archive.is/http://mail.opensolaris.org/pipermail/zfs...

      Apple took DTrace, which is licensed under the CDDL just like ZFS, and put it into their kernel without issue. Of course a file system is much more central to an operating system, so they wanted much more of a CYA for that.

  • The license is not a real issue. It just has to be distributed as a separate module. No big hurdle.

    • The main hurdle is hostile Linux kernel developers who aren't held accountable for intentionally breaking ZFS for their own petty ideological reasons, e.g. removing the in-kernel FPU/SIMD register save/restore API and replacing it with a "new" API that does the same thing.

      What's "new" about the "new" API? Its symbols are GPL2-only, to deny its use to non-GPL2 modules (like ZFS). Guess that's an easy way to make sure that BTRFS is faster than ZFS, or to set yourself up as the (to-be) injured party.

      Of course, a reimplementation of the old API in terms of the new one is an evil "GPL condom" violating the kernel license, right? Why can't you see that ZFS's CDDL license is the real problem here, for being the wrong flavour of copyleft license. Way to claim the moral high ground, you short-sighted, bigoted pricks. sigh

    • From my point of view it is a real usability issue.

      ZFS modules are not in the official repos. You either have to compile them on each machine or use unofficial repos, which is not exactly ideal and can break things if those repos are not up to date. And I guess it also needs some additional steps for a Secure Boot setup on some distros?

      I really want to try ZFS, because btrfs has some issues with RAID5 and RAID6 (it is not recommended, so I don't use it), but I am not sure I want to risk overall system stability. I would not want to end up in a situation where my machines don't boot and I have to fix them manually.

      3 replies →

    • It is a problem because most of the internal kernel APIs are GPL-only, which limits the abilities of the ZFS module. It is a common source of argument between the Linux guys and the ZFS-on-Linux guys.

      The reason for this is not just to piss off non-GPL module developers. GPL-only internal APIs are subject to change without notice, even more so than the rest of the kernel. And because the licence may not allow the Linux kernel developers to make the necessary changes to the ZFS module when that happens, there is a good chance it breaks without warning.

      And even apart from that, all internal APIs may change; it is just a bit less likely than for the GPL-only ones. Because ZFS on Linux is a separate module, there is no guarantee that it won't break with successive Linux versions; in fact, it is all but guaranteed that it will.

      Linux is proudly monolithic, and to keep a constantly evolving monolithic kernel working, its developers need to have control over the entire project. It is also community-driven. Combined, you need rules for the community to work together, or everything will break down, and that's what the GPL is for.

    • I remember it being a pain in the ass on Fedora, which tracks mainline closely. Frequently a new kernel version would come out that the ZFS module didn't support yet, so you'd have to downgrade and hold back the package until support was added.

      Fedora packages zfs-fuse. I think some distros have arrangements to make sure their kernels have ZFS support. It may be less of a headache on those.

      In-tree filesystems don't break that way.

  • You've been able to add and remove devices at will for a long time with btrfs (only recently supported in ZFS, with lots of caveats).

    Btrfs also supports async/offline dedupe.

    You can also layer it on top of mdadm. IIRC ZFS strongly discourages using anything but directly attached physical disks.
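
    The add/remove part looks like this (device and mount point names made up):

        # grow: add a device, then rebalance data across the array
        btrfs device add /dev/sdc /mnt/data
        btrfs balance start /mnt/data
        # shrink: btrfs migrates data off the device before removing it
        btrfs device remove /dev/sdb /mnt/data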

> How come that Windows still uses a 32 year old file system?

Simple. Because most of the burden is taken by the (enterprise) storage hardware hosting the FS. Snapshots, block-level deduplication, object storage technologies, RAID/resiliency, size changes, you name it.

Modern storage appliances are black magic, and you don't need many more features from NTFS. You either access the storage transparently via NAS/SAN or store your NTFS volumes on capable disk boxes.

In the Linux world, at the higher end, there's Lustre and GPFS. ZFS is mostly for resilient, but not performance critical needs.

  • >ZFS is mostly for resilient, but not performance critical needs.

    Los Alamos disagrees ;)

    https://www.lanl.gov/media/news/0321-computational-storage

    But yes, in general you are right. CERN, for example, uses Ceph:

    https://indico.cern.ch/event/1457076/attachments/2934445/515...

    • I think what LANL did predates GPUDirect, and other new technologies came after 2022, but that's a good start.

      CERN's Ceph is also for their "general IT" needs; their compute clusters are independent from that. Also, most of CERN's processing is distributed across Europe. We are part of that network.

      Many, if not all, of the HPC centers we talk with use Lustre as their "immediate" storage. Also, there's Weka now, a closed-source storage system supporting insane speeds and tons of protocols at the same time, mostly used for and by GPU clusters around the world. You casually connect terabits to such a cluster. It's all flash, and flat-out fast.

      2 replies →

  • So private consumers should just pay for a cloud subscription if they want safer/more modern data storage for their PC? (Without a NAS.)

    • No, private consumers have a choice, since Linux and FreeBSD run well on their hardware. Microsoft is too busy shoveling their crappy AI and convincing OEMs to put a second Windows button (the Copilot button) on their keyboards.

    • Probably. There are levels of backups, and a cloud subscription SHOULD give you copies in geographically separate locations, with someone to help you restore when (NOT IF!) it's needed, which matters for people who aren't into computers and don't want to learn the complex details.

      I have all my backups on a NAS in the next room. This covers the vast majority of use cases for backups, but if my house burns down, everything is lost. I know I'm taking that risk, but really I should do better. Just paying someone to do it all in the cloud would probably be better for me as well, and I keep thinking I should do it.

      Of course paying someone assumes they will do their job. There are always incompetent companies out there to take your money.

      1 reply →

    • If you need Windows, you can use something like restic (checksums and compression) and external drives (more than one, stored in more than one place) to make backups. Plus, optionally (but not strictly needed), ReFS on your non-Windows partition, which is included in the Workstation/Enterprise editions of Windows.
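
      A minimal restic workflow along those lines (drive letters and paths made up):

          # one-time: create an encrypted repository on the external drive
          restic -r E:/restic-repo init
          # back up, then verify the repository's integrity
          restic -r E:/restic-repo backup C:/Users/me/Documents
          restic -r E:/restic-repo check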

      I trust my own backups much more than any subscription, not so much from a technical point of view as from an access point of view (e.g. losing access to your Google account).

      EDIT: You have to enable checksumming and/or compression for data on ReFS manually:

      https://learn.microsoft.com/en-us/windows-server/storage/ref...

      7 replies →

    • I think Microsoft discontinued the old Windows 7-style backup tool to force people to buy OneDrive subscriptions. They also forcibly enabled the feature when they first introduced it.

      So, I think that your answer for this question is "unfortunately, yes".

      Not that I support the situation.

    • Having a NAS is life-changing. Doesn't have to be some large 20-bay monstrosity, just something that will give you redundancy and has an ethernet jack.

    • No, if they need ZFS-like functionality, they just pay for a NAS.

      ZFS is not in the same market as AWS S3.

> I just don't get how the Windows world - by far the largest PC platform by userbase - still doesn't have any answer to ZFS.

The mainline Linux kernel doesn't either, and I think the answer is that it's hard and high-risk, with a return mostly measured in technical respect?

  • Technically speaking, bcachefs has been merged into the Linux Kernel - that makes your initial assertion wrong.

    But considering it's had two drama events within a year of getting merged... I think we can safely confirm your conclusion that it's really hard.

    • > Technically speaking, bcachefs has been merged into the Linux Kernel - that makes your initial assertion wrong.

      bcachefs doesn't implement its erasure coding/RAID yet? Doesn't implement send/receive. Doesn't implement scrub/fsck. See: https://bcachefs.org/Roadmap, https://bcachefs.org/Wishlist/

      btrfs is still more of a legit competitor to ZFS these days, and it isn't close to touching ZFS where it matters. If the perpetually half-finished bcachefs and btrfs are the "answer" to ZFS, that seems like too little, too late to me.

      13 replies →

Honest question: as an end user who uses Windows and Linux and does not use ZFS, what am I missing?

  • Way better data security and resilience against file rot. This goes for both HDDs and SSDs. Copy-on-write, snapshots, end-to-end integrity. Also easier to extend the storage, and to protect against drive failure (and SSDs corrupt in a sneakier way), with pools.

    • How many of us are using single disks on our laptops? I have a NAS and use all of the above but that doesn’t help people with single drive systems. Or help me understand why I would want it on my laptop.

      11 replies →

    • The data security and rot resilience only hold for systems with ECC memory. Correct data with a faulty checksum will be treated the same as incorrect data with a correct checksum.

      Windows has its own extended filesystem capabilities through Storage Spaces, with many ZFS features available as lesser-used Storage Spaces options, especially when combined with ReFS.

      6 replies →

  • For a while I ran OpenSolaris with ZFS as the root filesystem.

    The key feature for me, which I miss, is the snapshotting integrated into the package manager.

    ZFS allows snapshots more or less for free (due to copy-on-write), including cron-based snapshotting every 15 minutes. So if I made a mistake anywhere, there was a way to recover.

    And that, integrated with the update manager and boot manager, means that a snapshot is created on every update, and during boot one can switch between states. I never had a broken update, but it gave a good feeling.

    On my home server I like the RAID features, and on Solaris it was nicely integrated with NFS etc., so one can easily create volumes, export them, and set restrictions (max size etc.) on them.
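
    The cron half of that is easy to replicate on Linux (dataset name made up):

        # take a timestamped snapshot (run from a script that cron calls every 15 minutes)
        zfs snapshot rpool/home@auto-$(date +%Y%m%d-%H%M)
        # after a mistake: list snapshots and roll back
        zfs list -t snapshot
        zfs rollback rpool/home@auto-20250101-1200   # -r needed if newer snapshots exist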

    • > is the snapshotting integrated into the package manager.

      Some Linux distros have that by default with btrfs, and usually it's a package install away if you're already on btrfs.

  • Much faster launching of applications/files you use regularly. The ability to always roll back updates in seconds if they cause issues, thanks to snapshots. Fast backups with snapshots + zfs send/receive to a remote machine. Compressed disks, which both let you store more on a drive and make accessing files faster. Easy encryption. The ability to mirror 2 large USB disks so you never have your data corrupted or lose it to drive failure. You can move your data or an entire OS install to a new computer easily by booting a live disk and just doing a send/receive to the new PC.

    (I have never used dedup, but it's there if you want I guess)
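
    The send/receive part is pleasantly boring in practice (host, pool and snapshot names made up):

        # initial full replication to another machine
        zfs snapshot tank/home@mon
        zfs send tank/home@mon | ssh backupbox zfs receive pool2/home
        # later runs only send the incremental difference
        zfs snapshot tank/home@tue
        zfs send -i @mon tank/home@tue | ssh backupbox zfs receive pool2/home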

  • Online filesystem checking and repair.

    Reading any file will tell you with 100% guarantee if it is corrupt or not.

    Snapshots that you can `cd` into, so you can compare any prior version of your FS with the live version of your FS.

    Block-level compression.
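
    The snapshot browsing works through the hidden .zfs directory every dataset exposes (paths made up):

        # a read-only view of the filesystem as it was at snapshot time
        cd /tank/home/.zfs/snapshot/auto-20250101-1200
        # compare an old file tree against the live one
        diff -r ./projects /tank/home/projects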

    • >Reading any file will tell you with 100% guarantee if it is corrupt or not.

      Only possible if it was not corrupted in RAM before it was written to disk.

      Using ECC memory is important, irrespective of ZFS.

  • Snapshots (note: NTFS does have this by way of Volume Shadow Copy, but it's not as easily accessible to the end user as it is in ZFS). Copy-on-write for reliability under crashes. Block checksumming for data protection (bitrot).

NTFS has been extended in various ways over the years, to the point that what you could do with an NTFS drive 32 years ago feels like a completely different filesystem from what you can do with it on current Windows.

Honestly, I really like ReFS, particularly in the context of Storage Spaces, but I don't think it's relevant to Microsoft's consumer desktop OS, where users don't have 6 drives they need to pool together. Don't get me wrong, I use ZFS because that's what I can get running on a Linux server, and I'm not going to run Windows Server just for the storage pooling... but ReFS + Storage Spaces wins my heart with its 256 MB slab approach. This means you can add and remove mixed-size drives and still get the maximum space utilization for the parity settings of the pool. ZFS, meanwhile, is only now getting online adds of same-size or larger drives, 10 years later.

OS development pretty much stopped around 2000. ZFS is from 2001. I don't count a new way to organise my photos or integration with a search engine as "OS", though.

The same reason file deduplication is not enabled for client Windows: greed.

For example, there are numerous new file systems people use: OneDrive, Google Drive, iCloud Storage. Do you get it?

NTFS is good enough for most people, who have a laptop with one SSD in it.

  • The benefits of ZFS don't need multiple drives to be useful. I've been running ZFS on root for years now, and snapshots have saved my bacon several times. Also, with block checksums you can at least detect bitrot. And COW is always useful.
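
    Detection is one command away (pool name made up):

        # read and verify every block in the pool, then report errors per device
        zpool scrub rpool
        zpool status -v rpool   # scrub runs in the background; status shows progress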

    • Windows manages volume snapshots on NTFS through VSS. I think ZFS snapshots are a bit "cleaner" in design, and the tooling is a bit friendlier IMO, but the functionality to snapshot, roll back, and save your bacon is there regardless. Outside of the automatically enabled "System Restore" (which only uses VSS to snapshot specific system files during updates), I don't think anyone bothers to use it, though.

      CoW, advanced parity, and checksumming are the big ones NTFS lacks. CoW is just inherently not how NTFS is designed, and checksumming isn't there. Everything else (encryption, compression, snapshots, ACLs, large scale, virtual devices, basic parity) is done through NTFS on Windows.

      2 replies →