Comment by ryao
5 days ago
I am not sure that reshaping is a reasonable thing to want. It is not considered reasonable in other fields. In civil engineering, if you build a bridge and later want more lanes, you usually build a new bridge rather than reshape the existing one. The idea of reshaping a bridge while cars are driving across it would sound insane there, yet that is exactly what people want from storage stacks.
Reshaping in traditional storage stacks does not account for all of the ways things can go wrong, and handling all of them well is hard, if not impossible, in traditional RAID. There is a long history of hardware analogs to MD RAID killing parity arrays when they encounter silent corruption, because the corruption makes it impossible to know what is supposed to be stored there. There is also the case where things are corrupted such that a valid reconstruction exists, yet the reconstruction silently produces the wrong data.
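To make the "impossible to know" part concrete, here is a minimal sketch in Python (block contents and sizes are made up; real arrays work on large stripes, but the math is the same):

```python
from functools import reduce

def xor_parity(blocks):
    """Bytewise XOR of equal-length blocks, i.e. RAID 5 style parity."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # three data blocks of one stripe
P = xor_parity(data)                 # parity block: P = D0 ^ D1 ^ D2

data[1] = b"BBBX"                    # silent corruption; the disk reports no error

# A scrub can see that the stripe is inconsistent...
assert xor_parity(data) != P

# ...but not WHICH of the four blocks (D0, D1, D2, P) is wrong. Any single
# block can be "reconstructed" from the other three, and every guess yields
# a self-consistent stripe:
for i in range(3):
    others = [d for j, d in enumerate(data) if j != i]
    print(f"assume D{i} is bad ->", xor_parity(others + [P]))
# Prints AAA[ / BBBB / CCCY: only the D1 guess restores the truth, and the
# array cannot tell the three guesses (or "P is bad, recompute it") apart
# without a checksum of what each block was supposed to contain.
```

That last point is exactly what ZFS adds: a checksum of the intended contents, stored separately from the block, so it can pick the one reconstruction that is actually correct.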
Reshaping certainly is easier to do with MD RAID, but the feature comes with the trade-off that edge cases are not handled well. For most people, I imagine that risk is fine until it bites them; then it is not fine anymore. ZFS made an effort to handle all of the edge cases so that they do not bite people, and doing that took time.
> I am not sure if reshaping is a reasonable thing.
Yet people are celebrating when ZFS adds it. Was it all for nothing?
People wanted it, but it was very hard to do safely. While ZFS can now do it safely, many other storage solutions cannot.
Those corruption issues I mentioned, where the RAID controller has no idea what to do, affect far more than just reshaping. They affect traditional RAID arrays when disks die and when patrol scrubs are done. I have not tested MD RAID on edge cases lately, but the last time I did, I found that MD RAID ignored corruption whenever possible. It did not detect corruption in normal operation because it assumed all data blocks were good unless SMART said otherwise. Thus, it would randomly serve bad data from corrupted mirror members, and it would always serve bad data from RAID 5/6 members whose data blocks were corrupted. This was particularly tragic on RAID 6, where MD RAID could in principle detect and even correct the corruption if it tried. Doing that on every read would come with such a huge performance overhead that it is clear why it is not done.
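For reference, here is what that RAID 6 check looks like, as a toy sketch in Python: with two parity blocks, the array can not only detect a single corrupted block but also locate it. One byte stands in for each disk; the only real-world detail is the GF(2^8) polynomial 0x11d that Linux RAID 6 uses, and everything else is made up:

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return p

def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

data = [0x41, 0x42, 0x43, 0x44]        # four data "disks", one byte each
P = 0
Q = 0
for i, d in enumerate(data):
    P ^= d                             # P: plain XOR parity
    Q ^= gf_mul(gf_pow(2, i), d)       # Q: sum of g^i * D_i, with g = 2

data[2] ^= 0x5A                        # silent corruption on disk 2

# Syndromes computed from a full-stripe read:
S0 = P
S1 = Q
for i, d in enumerate(data):
    S0 ^= d
    S1 ^= gf_mul(gf_pow(2, i), d)

# With exactly one bad data block, S1 == g^z * S0 where z is its index,
# and S0 is the error value itself.
z = next(i for i in range(len(data)) if gf_mul(gf_pow(2, i), S0) == S1)
data[z] ^= S0                          # flip the bad bits back
assert z == 2 and data == [0x41, 0x42, 0x43, 0x44]
```

A real implementation would use a discrete-log table instead of the search for z, but either way it requires reading and computing over every member of the stripe on every read, which is the performance overhead I mean.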
Getting back to reshaping: while I did not explicitly test it, I would expect that unless a disk is missing or disappears during a reshape, MD RAID ignores any corruption that could be detected using parity and assumes all data blocks are good, just like it does in normal operation. It does not make sense for MD RAID to look for corruption during a reshape, since not only would it be slower, but even if it found corruption, it would have no way to correct it unless RAID 6 were in use, no members were missing or failed, and the affected stripe had no read errors (a bad sector reported by the disk effectively acts like a missing member).
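A toy model of the consequence, again in Python with made-up two-byte blocks (this is not MD RAID's real on-disk layout, just the shape of the problem): a reshape that trusts data blocks and recomputes parity for the new geometry turns detectable corruption into internally consistent stripes.

```python
from functools import reduce

def parity(blocks):
    """Bytewise XOR parity, as in RAID 5."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Old geometry: 2 data blocks + parity per stripe (3-disk RAID 5).
stripes = [[b"D0", b"D1"], [b"D2", b"D3"], [b"D4", b"D5"]]
old_parity = [parity(s) for s in stripes]

stripes[1][0] = b"XX"                        # silent corruption of "D2"
assert parity(stripes[1]) != old_parity[1]   # a scrub could still notice, for now

# Reshape to 3 data blocks + parity (4-disk RAID 5), trusting every data
# block and recomputing parity for the new geometry:
flat = [blk for s in stripes for blk in s]
new = [(flat[i:i + 3], parity(flat[i:i + 3])) for i in range(0, len(flat), 3)]

# After the reshape, every stripe is self-consistent again, so even the
# parity mismatch that could have flagged the problem is gone:
assert all(parity(group) == p for group, p in new)
```

Once parity has been recomputed over corrupted data, the one signal that could have exposed the corruption is gone for good.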
You could do your own tests. You should find that ZFS gracefully handles the edge cases where the wrong thing sits in a spot where something important should be, while MD RAID does not. MD RAID is a reimplementation of technology from the 1980s. If 1980s storage technology handled these edge cases well, Sun Microsystems would not have made ZFS to get away from it.
> While ZFS now can do it safely ...
It's the first release with the code, so "safely" might not be the right description until a few point releases happen. ;)