Comment by toast0
9 years ago
My Hadoop experience is dated (circa 2011). Do the worker nodes still poll the scheduler to see if they have work to do? If so, that's still a giant impediment to speed for smaller tasks, especially if poll times are in the range of minutes.
If Hadoop put effort into making small tasks time-efficient, I think your argument has merit, provided there's a reasonable chance of actually needing to scale, or of picking up ancillary benefits (fault tolerance, access to other data that needs to be processed with Hadoop, etc.).