Comment by anktor

3 days ago

It's been mentioned already, but I want to add that the original idea of the post (a mid-size VPS hosting Apache Spark) might be missing that Spark is built for distributed, resilient work: if a node fails, the framework can recover that node's work instead of losing it.

If you don't need these features, especially the distributed one, going tall (a single high-capacity instance, replicated when necessary) or going simpler (multiple servers, but without Spark coordinating the work) could be good options, depending on your or your team's knowledge.
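
To make the "going simpler" option a bit more concrete, here is a rough sketch of how several servers could split a job with no coordinator at all: each server is told its own index and the total server count, and it only picks up the inputs whose hash lands on its index. The names (`owns`, `my_share`, the `part-NNNN.csv` file names) are made up for illustration and are not from the original post.

```python
import zlib


def owns(key: str, worker_index: int, worker_count: int) -> bool:
    """Return True if the worker at `worker_index` is responsible for `key`."""
    # crc32 gives the same value on every machine and in every process,
    # unlike the built-in hash(), which is randomized per process for strings.
    return zlib.crc32(key.encode("utf-8")) % worker_count == worker_index


def my_share(keys, worker_index: int, worker_count: int):
    """Return the subset of `keys` this worker should process, in input order."""
    return [k for k in keys if owns(k, worker_index, worker_count)]


if __name__ == "__main__":
    # Hypothetical input list; every server would build the same list
    # and then keep only its own share.
    files = [f"part-{i:04d}.csv" for i in range(10)]
    for index in range(3):
        print(index, my_share(files, index, 3))
```

The catch is the one mentioned above: nothing here notices a dead server, so its share has to be rerun by hand. That resilience is what Spark would be adding.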