Comment by jauco
14 hours ago
You’d end up implementing your own home-grown versions of hash join, query pushdown (skipping Parquet row groups entirely), etc., plus your own home-grown heuristics for selecting the right one (planning).
That can outperform a generic solution like this, of course, but in most cases getting it to be faster is not less work.
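To make the "home-grown" part concrete, here is a rough sketch of what the three pieces look like in plain Python. Everything in it is hypothetical and for illustration only: `RowGroup` is a stand-in for a Parquet row group with min/max statistics (not a real Parquet reader), and `make_row_groups`, `scan_with_pushdown` and `hash_join` are made-up names, not DuckDB APIs. The "planner" is a single heuristic: build the hash table on the smaller input.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group: rows plus min/max stats."""
    rows: list
    min_key: int
    max_key: int


def make_row_groups(rows, size, key):
    """Chunk rows into groups of `size` and record min/max of `key` per group."""
    groups = []
    for i in range(0, len(rows), size):
        chunk = rows[i:i + size]
        keys = [r[key] for r in chunk]
        groups.append(RowGroup(chunk, min(keys), max(keys)))
    return groups


def scan_with_pushdown(groups, lo, hi, key):
    """Return rows with lo <= row[key] <= hi and the number of groups read.

    Groups whose min/max stats cannot overlap [lo, hi] are skipped entirely.
    """
    out, scanned = [], 0
    for g in groups:
        if g.max_key < lo or g.min_key > hi:
            continue  # stats rule this group out, never touch its rows
        scanned += 1
        out.extend(r for r in g.rows if lo <= r[key] <= hi)
    return out, scanned


def hash_join(left, right, key):
    """Inner equi-join on `key`, building the hash table on the smaller side."""
    # One-line "planner": the smaller input becomes the build side.
    build, probe = (left, right) if len(left) <= len(right) else (right, left)
    table = defaultdict(list)
    for row in build:
        table[row[key]].append(row)
    joined = []
    for row in probe:
        for match in table.get(row[key], []):
            # On column-name collisions the probe-side value wins.
            joined.append({**match, **row})
    return joined


# Made-up data: 100 orders spread over 5 users, stored in 10 row groups.
users = [{"user_id": u, "name": f"user{u}"} for u in range(5)]
orders = [{"order_id": i, "user_id": i % 5} for i in range(100)]
groups = make_row_groups(orders, size=10, key="order_id")

recent, scanned = scan_with_pushdown(groups, 90, 99, key="order_id")
joined = hash_join(recent, users, key="user_id")
```

Even this toy version already needs decisions about build side, stats layout and collision handling, which is the part a query engine does for you.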
Also, DuckDB can give you access to an in-memory representation (e.g. `fetch_arrow_table()`), so you have less “language data structure wrapping” overhead, and you can do the filtering yourself on that. In most cases the `WHERE` clauses will win, though.