Comment by ignaciovdk
6 days ago
Hey there, great questions.
The benchmarks weren’t run on the same machine as MinIO, but on the same network, connected over a 1 Gbps switch, so there’s a bit of real network latency, though still close to local-disk performance.
We’ve also tried a true remote setup before (compute around ~160 ms away from AWS S3). I plan to rerun that scenario soon and publish the updated results for transparency.
Regarding “hot vs. cold” data, Arc doesn’t maintain separate tiers in the traditional sense. All data lives in the S3-compatible storage (MinIO or AWS S3), and we rely on caching for repeated query patterns instead of a separate local tier.
In practice, Arc performs better than ClickHouse when using S3 as the primary storage layer. ClickHouse can scan faster in pure analytical workloads, but Arc tends to outperform it on time-range–based queries (typical in observability and IoT).
I’ll post the new benchmark numbers in the next few days, they should give a clearer picture of the trade-offs.
No comments yet
Contribute on Hacker News ↗