Comment by zzzcpan
9 years ago
You don't have the same tools. You are probably thinking about emulating POSIX filesystem API and things like that and using those command-line tools on top of that in a single-box kind of way. That's not how you treat your distributed system.
EDIT: For something that easily beats a single box, I envision an interpreter with a JIT running on each node of the distributed system, in the same process that stores the data, so there is pretty much no overhead to access and process it.
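A very rough sketch of that idea, with the nodes simulated as plain Python objects in one process. Everything here (`Node`, `run`, `run_everywhere`) is a made-up name for illustration and not any real system's API; the only point is that the job is shipped to where the data lives and just a small result comes back.

```python
from typing import Any, Callable, Iterable, List


class Node:
    """Hypothetical storage node: keeps its shard in memory and runs
    submitted jobs in the same process, right next to the data."""

    def __init__(self, records: Iterable[str]) -> None:
        self._records: List[str] = list(records)

    def run(self, job: Callable[[List[str]], Any]) -> Any:
        # The job reads the shard directly; only its result leaves the node.
        return job(self._records)


def run_everywhere(nodes: List[Node],
                   job: Callable[[List[str]], Any],
                   combine: Callable[[List[Any]], Any]) -> Any:
    """Send the same job to every node and combine the partial results."""
    partials = [node.run(job) for node in nodes]
    return combine(partials)


# Two simulated nodes, each holding a shard of text lines.
nodes = [Node(["a b", "c"]), Node(["d e f"])]

# Count words per shard on the node, then add up the per-node counts.
total_words = run_everywhere(
    nodes,
    job=lambda recs: sum(len(r.split()) for r in recs),
    combine=sum,
)
```

In a real system `run` would be a network call and the job would be source or bytecode handed to the node's interpreter, but the shape stays the same: the raw records never cross the wire.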
>You are probably thinking about emulating POSIX filesystem API and things like that and using those command-line tools on top of that in a single-box kind of way. That's not how you treat your distributed system.
Yeah, but Manta's mapreduce does something close, and it seems to work okay.