Comment by riku_iki

3 days ago

> But that’s not the typical Lakehouse use case.

That benchmark is also not a typical lakehouse use case: all data is hosted locally, so it doesn't test a significant component of the stack.

Yeah, that’s one of many issues with ClickBench. It’s also a single table, so it can’t test joins.

TPC-H is okay but not Lakehouse-specific. I’m not aware of any benchmarks that specifically test engine performance under common setups like external storage or scalable compute. It would be hard to design one that’s easily reproducible. (And in fairness to ClickBench, it’s intentionally simple for that exact reason: to generate a baseline score for any query engine that can query tabular data.)