Comment by rusanu

9 years ago

Reading 60TB at a 500MB/s transfer rate takes more than a day (about 33 hours). This is the problem of drinking the ocean through a straw. Even with SSD transfer rates, it is still a problem at scale. Clusters give you not only capacity, but also a multiplication factor for transfer rates.
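A quick back-of-the-envelope check of that figure, as a sketch only: it assumes the rate means 500 megabytes per second, decimal units (1TB = 10^12 bytes), and a purely sequential read with no overhead. The function name `read_time_hours` is made up for illustration.

```python
def read_time_hours(capacity_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to read capacity_tb terabytes at throughput_mb_s megabytes/second.

    Assumes decimal units and a sustained sequential read with no overhead.
    """
    total_bytes = capacity_tb * 1e12
    bytes_per_second = throughput_mb_s * 1e6
    return total_bytes / bytes_per_second / 3600


hours = read_time_hours(60, 500)
print(f"{hours:.1f} hours ({hours / 24:.2f} days)")  # 33.3 hours (1.39 days)
```

If the rate were megabits rather than megabytes per second, the same read would take eight times longer.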

2 comments

faragon 9 years ago

Just use 24 of them interleaved/striped and loading the data will take not much more than an hour.

rusanu 9 years ago

But then you need small disks (e.g. 2TB). My point is that huge-capacity drives are not appropriate in compute environments such as Hadoop. They're more for cold storage.
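The trade-off in this exchange can be sketched numerically. This is a rough illustration, not a benchmark: it assumes 60TB of total data, 500 megabytes per second per drive, decimal units, and ideal linear scaling across striped drives (real arrays lose some of that to controller and bus limits). `striped_read` is a hypothetical helper name.

```python
def striped_read(total_tb: float, drives: int, per_drive_mb_s: float):
    """Return (TB stored per drive, hours to read everything in parallel).

    Assumes the data is spread evenly and every drive streams at full
    speed at the same time (ideal striping).
    """
    per_drive_tb = total_tb / drives
    seconds = per_drive_tb * 1e12 / (per_drive_mb_s * 1e6)
    return per_drive_tb, seconds / 3600


for n in (1, 24):
    size_tb, hours = striped_read(60, n, 500)
    print(f"{n:>2} drive(s): {size_tb:.1f} TB each, {hours:.1f} h to read everything")
```

Under these assumptions, 24 drives each hold only 2.5TB of the data, which is why the faster read goes hand in hand with smaller disks.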