I've tried btrfs without much luck.
btrfs still has a write hole for RAID5/6 (the kind I primarily use) [0] and has since at least 2012.
For a filesystem to have a bug leading to data loss go unpatched for over 8 years is just plain unacceptable.
I've also had issues even without RAID, particularly after power outages. Not minor issues but "your filesystem is gone now, sorry" issues.
[0]: https://btrfs.wiki.kernel.org/index.php/RAID56
It's not a bug, but an unimplemented feature. They never made any promise that raid5 is production-ready.
Pretty much all software RAID systems suffer from it unless they explicitly patch over it via journaling. Hardware RAID gets away with it if it has a battery backup; if it doesn't, it suffers from exactly the same problem.
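For anyone unfamiliar with the write hole being discussed: in RAID5, parity is the XOR of the data blocks in a stripe, and updating one block requires two separate disk writes (data, then parity). A toy sketch (illustrative only, not any real RAID driver) of what happens when power is lost between those two writes:

```python
def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (how RAID5 parity is computed)."""
    return bytes(x ^ y for x, y in zip(a, b))

# A 3-disk stripe: two data blocks and one parity block, initially consistent.
d1 = b"\x11" * 4
d2 = b"\x22" * 4
parity = xor(d1, d2)            # invariant: d1 ^ d2 == parity

# Overwrite d1, but "lose power" before the matching parity write lands.
d1 = b"\x99" * 4                # new data reaches disk 1
# parity = xor(d1, d2)          # <- this write never happens

# Later, disk 2 dies. The array reconstructs d2 from d1 and the stale parity:
reconstructed_d2 = xor(d1, parity)
print(reconstructed_d2 == d2)   # prints False
```

Note the failure mode: the corrupted block is d2, which was never written to at all, and nothing flags the reconstruction as wrong. Journaling (or ZFS-style copy-on-write full-stripe writes) closes the hole by making the data+parity update atomic.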
... hence the desire to use ZFS, which skips trying to present a single coherent block device and performs parity at the file (chunk) level.
My home NAS runs btrfs in RAID 5. The key is to use software RAID / LVM to present a single block device to btrfs. That way you never use btrfs's screwed-up RAID 5/6 implementation.
If you use LVM/mdadm for RAID, it's not possible for btrfs to correct checksum mismatches (i.e. protect against bitrot).
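The distinction here is detection versus repair. A minimal sketch (using CRC32 as a stand-in for btrfs's per-block checksums) of why a checksum alone can't fix anything:

```python
# Illustrative only: a checksum tells you a block went bad, but repairing it
# requires a second, independently stored copy to read back from.
import zlib

block = b"important data"
stored_crc = zlib.crc32(block)      # checksum recorded at write time

corrupted = b"importZnt data"       # simulated bitrot on disk
print(zlib.crc32(corrupted) == stored_crc)   # prints False: detected...
# ...but when mdadm/LVM presents one logical device, btrfs sees only this one
# copy. With btrfs-native RAID1/10 it would fetch the mirror copy, verify its
# checksum, and rewrite the bad block.
```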
Why use RAID5/6? RAID10 is much safer because you drastically reduce the chance of a cascading resilvering failure. Yes, you get less capacity per drive, but drives are (relatively) cheap.
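The capacity trade-off is easy to quantify. A back-of-the-envelope sketch (illustrative arithmetic only; assumes n identical drives):

```python
def usable_tb(level: str, n: int, size: float) -> float:
    """Usable capacity for n drives of `size` TB under each RAID level."""
    if level == "raid5":
        return (n - 1) * size   # one drive's worth of parity
    if level == "raid6":
        return (n - 2) * size   # two drives' worth of parity
    if level == "raid10":
        return n / 2 * size     # everything mirrored once
    raise ValueError(level)

for level in ("raid5", "raid6", "raid10"):
    print(level, usable_tb(level, n=8, size=4.0), "TB usable of", 8 * 4.0)
# raid5  -> 28.0 TB, raid6 -> 24.0 TB, raid10 -> 16.0 TB
```

The flip side, and the reason for the cascading-failure worry: rebuilding a RAID10 mirror reads only the surviving half of one pair, while a RAID5/6 rebuild must read every surviving drive in full, which is exactly the kind of sustained stress that tips a second aging drive over the edge.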
I thought I wanted RAID5, but after reading horror stories of drives failing when replacing a failed drive, I decided it just wasn't worth the risk.
I currently run RAID1, and when I need more space, I'll double my drives and set up RAID10. I don't need most of the features of ZFS, so BTRFS works for me.
I use RAID6 because it gives me highly efficient utilization of my available storage capacity while still giving me some degree of redundancy should a disk fail. My workload is also mostly sequential, so random read/write performance isn't too important to me.
If a disk fails and resilvering causes a cascading failure, I can restore from a backup.
I think you might be mistaking RAID for a backup, which it very much is not, nor is it any kind of substitute for one. A backup ensures durability and integrity of your data by providing an independent fallback should your primary storage fail. RAID ensures availability of your data by keeping your storage online when up to N disks fail.
RAID won't protect you from an accidental "rm -Rf /", ransomware or other malware, bugs in your software or many other common causes of data loss.
I might consider RAID10 if I were running a business-critical server where availability was paramount, or where I needed decent random read/write performance, but even so I'd still want a hot failover and a comprehensively tested backup strategy.
btrfs is not at all reliable, so if you care about your files remaining intact and usable, it probably doesn't meet your requirements. It is like the MongoDB 0.1 of filesystems.
Seems pretty reliable these days. Are you commenting based upon personal experience? If so, when was it that you used btrfs?
When it comes to file systems, “pretty reliable” these days does not sound very good. Reliability has to be a fundamental design requirement for a file system. If not, it sounds like putting lipstick on a pig.
Red Hat throwing in the towel on their support for its development does not instill confidence either.
Nothing personally against Btrfs. Just an end user making a file system choice saying what I care about.
I have a laptop running openSUSE, with root on btrfs. Twice I have had to reinstall because it managed to corrupt the file system.