Comment by donor20

6 days ago

And the thing is, a single node can still scale ridiculously high without the orchestration overhead of distributed systems.

You can fit dual 192-core AMD CPUs (384 cores / 768 threads), 9 TB of memory, and a 24-disk SSD array in a 2U box.

The vast majority of businesses will never need more than a single-node architecture. Hardware advances keep increasing that share.

Spark and its modern counterpart Databricks are essentially obsolete for these organizations. Whatever justification they may have had in the past no longer holds.

I’ve recently closed down several in-house Spark clusters and replaced them with single nodes.

In addition to the simpler design and lower cost, there was a massive increase in performance. I expect this to become more common, leaving distributed architecture to a small and increasingly niche group.