Comment by ianburrell
2 years ago
RAID is distributed across drives on one machine, and that whole machine can fail. Plus, it can take a while to recover the machine or rebuild the array, and it is common for another drive to fail during recovery.
HDFS is distributed across multiple machines, each of which can have RAID. It is unlikely that enough machines will fail at the same time to lose data.
I believe that it's essentially equivalent, and neither RAID nor HDFS is good enough to exist without backups.
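To put a rough number on the "unlikely that enough machines will fail" point, here is a minimal back-of-the-envelope sketch. It assumes machine failures are independent and uses a made-up 1% chance that a given machine is lost before its data is re-replicated; the function name and the 1% figure are my own assumptions for illustration, not measured values. The 3 copies match the usual HDFS default replication factor.

```python
def loss_probability(p_fail: float, copies: int) -> float:
    """Chance that every copy is lost, assuming independent failures."""
    return p_fail ** copies


# Hypothetical: 1% chance a given machine dies before recovery finishes.
p = 0.01

# One machine holding the only copy (RAID or not, the machine is the unit).
single_machine = loss_probability(p, 1)

# Three copies on three different machines.
three_replicas = loss_probability(p, 3)

print(f"single machine: {single_machine:.6f}")
print(f"three replicas: {three_replicas:.6f}")
```

The independence assumption is the weak spot: a bad rack, a bad software release, or an accidental delete hits every copy at once, which is why backups still matter either way.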