Comment by ssivark
1 day ago
Setting aside complaints about the Pandas API, it's frustrating that we might see the community of a popular "standard" tool fragment into two or even three ecosystems (for libraries with slightly incompatible APIs) -- seemingly all with the value proposition of "making it faster". Based on the machine learning experience over the last decade, this kind of churn in tooling is somewhat exhausting.
I wonder how much of this is fundamental to the common approach of writing libraries in Python with the processing-heavy parts delegated to C/C++: the expressive parts cannot be fast, and the fast parts cannot be expressive. I also wonder whether Rust (for Polars, and for other newer-generation libraries) changes that tradeoff substantially.
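For concreteness on the "slightly incompatible APIs" point, here is a minimal sketch of the same aggregation written against pandas and against Polars. It assumes reasonably recent versions of both libraries (Polars renamed its grouping method to group_by around 0.19), and the column names are purely illustrative.

```python
import pandas as pd
import polars as pl

data = {"group": ["a", "a", "b"], "value": [1, 2, 3]}

# pandas: groupby on a column; as_index=False keeps the key as a regular column
pd_out = pd.DataFrame(data).groupby("group", as_index=False)["value"].mean()

# Polars: the equivalent is group_by plus an explicit expression via pl.col(...)
pl_out = pl.DataFrame(data).group_by("group").agg(pl.col("value").mean())

print(pd_out)
print(pl_out)
```

Neither snippet is meaningfully faster at this size; the point is only that code written against one API doesn't run unchanged against the other.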
I think it's a natural path of software life that compatibility often stands in the way of improving the API.
This really does seem like a rare case where everything speeds up without breaking compatibility. If you want a fast, revised API for your new project (or to rework your existing one), then you have a solution for that with Polars. If you just want your existing code and workloads to run faster, you now have a solution for that too.
It's OK to have a slow, compatible, static codebase to build things on, and then optimize as needed.
Trying to "fix" the API would break a ton of existing code, including existing plugins. Orphaning those projects and codebases would be the wrong move; those things take a decade to flesh out.
This really doesn't seem like the worst outcome, and doesn't seem to be creating a huge fragmented mess.
> Based on the machine learning experience over the last decade, this kind of churn in tooling is somewhat exhausting.
Don't come to old web devs with those complaints; every single one of them had to write at least one open source JavaScript library just to create their LinkedIn account!