Comment by arjie
9 years ago
So you have large data storage, and processing that can handle large data (assuming for convenience that you have a conventional x86 processor with that throughput). The only problem that remains is moving things from the former to the latter, and then back again once you're done calculating.
That's (100 * 1024 GB) / (20 GB/s) ≈ 85 minutes just to move your 100 TB to the processor, assuming your storage can operate at the same speed as DDR4 RAM. A 100-node Hadoop cluster with commodity disks (~0.2 GB/s each) lands in the same ballpark: (100 * 1024 GB) / (100 * 0.2 GB/s) ≈ 85 minutes.
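For concreteness, a minimal Python sketch of that arithmetic, using the 20 GB/s DDR4 and 0.2 GB/s commodity-disk figures above (the helper name is just illustrative):

    # Back-of-the-envelope transfer times for the figures above.
    DATA_GB = 100 * 1024  # 100 TB expressed in GB

    def transfer_minutes(data_gb, bandwidth_gb_s):
        """Minutes to stream data_gb at bandwidth_gb_s."""
        return data_gb / bandwidth_gb_s / 60

    # One machine reading at DDR4-like speed (~20 GB/s).
    print(transfer_minutes(DATA_GB, 20))         # ~85 min

    # 100 nodes, each streaming from a ~0.2 GB/s commodity disk.
    print(transfer_minutes(DATA_GB, 100 * 0.2))  # ~85 min

Either way you're bandwidth-bound, which is the point: the cluster wins on cost per GB/s, not on raw speed.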
Back-of-the-envelope stuff, obviously, with caveats everywhere.