Comment by kuschku

9 years ago

>Yes, there is, you have a lot of overhead in any case for the same tools.

You don't have the same tools. You are probably thinking about emulating a POSIX filesystem API and things like that, then using those command-line tools on top of it in a single-box kind of way. That's not how you treat a distributed system.

EDIT: For something that easily beats a single box, I envision an interpreter with a JIT running on each node of the distributed system, in the same process that stores the data, so it has pretty much no overhead to access and process it.
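
To make that idea concrete, here is a rough sketch, not a real system. Everything in it is hypothetical: `Node`, `count_matching`, and `run_everywhere` are made-up names, the "nodes" are plain in-memory objects, and a Python closure stands in for code shipped to an interpreter on each node. The point is only that the task runs next to the data and just the small result travels back.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Node:
    """Hypothetical storage node: keeps its shard and runs shipped code in-process."""
    records: List[str] = field(default_factory=list)

    def run(self, task: Callable[[Iterable[str]], int]) -> int:
        # The task executes where the data lives; only the result leaves the node.
        return task(self.records)


def count_matching(word: str) -> Callable[[Iterable[str]], int]:
    """Build a task that counts the records containing `word`."""
    def task(records: Iterable[str]) -> int:
        return sum(1 for r in records if word in r)
    return task


def run_everywhere(nodes: List[Node], task, combine=sum):
    # Assumption: a real cluster would send these calls out in parallel over
    # the network; here they run one after another in a single process.
    return combine(node.run(task) for node in nodes)


nodes = [
    Node(["error: disk full", "ok"]),
    Node(["ok", "error: timeout", "error: disk full"]),
]
total = run_everywhere(nodes, count_matching("error"))
```

Nothing here copies records out to a separate tool; each node hands back one integer and the caller adds them up.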

  • >You are probably thinking about emulating POSIX filesystem API and things like that and using those command-line tools on top of that in a single-box kind of way. That's not how you treat your distributed system.

    Yeah, but Manta's mapreduce does something close, and it seems to work okay.