Comment by kalleboo

1 month ago

> HDDs typically have a BER (Bit Error Rate) of 1 in 10^15, meaning some incorrect data can be expected around every 100 TiB read. That used to be a lot, but now that is only 3 or 4 full drive reads on modern large-scale drives.

I remember this argument from way back, 16 years ago, when the "Why RAID 5 stops working in 2009" article[0] blew up. It's BS. Those aren't the actual average error rates. Those are inflated worst-case figures: the threshold below which the manufacturer does not want to bother honoring a warranty claim.

I have a pool with 260 TB worth of 10/14 TB disks in it, 80% full, with monthly scrubs going back years. Not a single checksum error, and in total something like 30 reallocated sectors seen in SMART (half of those on a 7-year-old drive).
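
For a rough sense of scale, here is a back-of-the-envelope sketch of what the spec-sheet number would predict for a pool like this if it really were the average rate. The scrub count is an assumption (36 monthly scrubs, since "years" isn't pinned down), as are treating 260 TB as decimal terabytes, treating bit errors as independent, and assuming every such error would show up in a scrub. The function names are made up for illustration.

```python
import math

SPEC_BER = 1e-15   # spec-sheet figure from the quote: 1 bad bit per 10^15 bits read
POOL_TB = 260      # pool size from the comment, taken as decimal TB (assumption)
FILL = 0.80        # pool is 80% full
SCRUBS = 36        # ASSUMPTION: three years of monthly scrubs; the comment only says "years"


def expected_errors(bytes_read: float, ber: float = SPEC_BER) -> float:
    """Expected number of bit errors when reading bytes_read bytes at rate ber."""
    return bytes_read * 8 * ber


def prob_zero_errors(expected: float) -> float:
    """Poisson probability of seeing no errors at all, given the expected count."""
    return math.exp(-expected)


bytes_per_scrub = POOL_TB * 1e12 * FILL
per_scrub = expected_errors(bytes_per_scrub)
total = per_scrub * SCRUBS

print(f"data read per expected error: {1 / (8 * SPEC_BER) / 1e12:.0f} TB")
print(f"expected errors per scrub: {per_scrub:.2f}")
print(f"expected errors over {SCRUBS} scrubs: {total:.1f}")
print(f"chance of zero errors: {prob_zero_errors(total):.1e}")
```

Under those assumptions the spec rate would predict somewhere around 1.7 errors per scrub and about 60 over three years, with the odds of a completely clean record being vanishingly small. A clean record over that much reading is hard to square with the spec figure being the real-world average.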

[0] https://www.zdnet.com/article/why-raid-5-stops-working-in-20...

Agreed. I have a couple of servers, each with 168 hard drives, about 6 years old. A few hard drives are starting to fail. ZFS counts read errors (the drive reported an error because its own internal checksum didn't match) and checksum errors (the drive returned data that was actually incorrect and ZFS caught it with its own checksum). I have seen lots of read errors, but not a single checksum error yet. Though these are server-grade drives, which might be better than consumer-grade.