Comment by otsaloma

4 days ago

Agreed, I've never had a problem with the speed of anything NumPy- or Arrow-based.

Here's my alternative: https://github.com/otsaloma/dataiter https://dataiter.readthedocs.io/en/latest/_static/comparison...

Planning to switch to NumPy 2.0 strings soon. Other than that I feel all the basic operations are fine and solid.

Note for anyone else rolling up their sleeves: you can get quite far with pure Python when building on top of NumPy (or maybe Arrow). The only thing I found needing more performance was group-by-aggregate, where Numba seems to work OK, although it's a bit difficult to have as a dependency.
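
To illustrate the group-by-aggregate point, here is a rough sketch of what it can look like with plain NumPy. This is not dataiter's actual code; group_sum and group_apply are hypothetical names made up for the example. Sums vectorize nicely, but an arbitrary aggregation ends up calling a Python function once per group, which is presumably the kind of loop where something like Numba starts to pay off.

```python
import numpy as np

def group_sum(keys, values):
    """Sum values per distinct key, fully vectorized (hypothetical helper)."""
    # unique_keys comes back sorted; inverse maps each row to its group index.
    unique_keys, inverse = np.unique(keys, return_inverse=True)
    sums = np.bincount(inverse.ravel(), weights=values, minlength=len(unique_keys))
    return unique_keys, sums

def group_apply(keys, values, func):
    """Apply func to each group's values (hypothetical helper).

    Assumes keys is a non-empty 1-D array. The per-group Python call
    is the slow part when there are many small groups.
    """
    # A stable sort keeps the original row order within each group.
    order = np.argsort(keys, kind="stable")
    sorted_keys = keys[order]
    sorted_values = values[order]
    # Positions in the sorted arrays where a new group begins (first group excluded).
    boundaries = np.flatnonzero(sorted_keys[1:] != sorted_keys[:-1]) + 1
    starts = np.concatenate(([0], boundaries))
    groups = np.split(sorted_values, boundaries)
    return sorted_keys[starts], np.array([func(g) for g in groups])

keys = np.array(["a", "b", "a", "c", "b"])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

sum_keys, sums = group_sum(keys, values)
max_keys, maxes = group_apply(keys, values, np.max)
```

With the sample data above, group_sum gives sums of 4, 7 and 4 for keys a, b and c, and group_apply with np.max gives 3, 5 and 4.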