Comment by esafak

2 years ago

What is the largest dataset people here are processing daily for ETL on one machine? What tools are you using, and what does the job do? I want to know how capable newer libraries like Polars are, and how long you can put off moving to Spark. Are terabyte-scale datasets feasible yet?