Comment by faizshah
5 days ago
Pandas is a widely known DSL at this point, so lots of data scientists know pandas like the back of their hand, and that's why a lot of "pandas but for X" libraries have become popular.
I agree that pandas does not have the best-designed API compared to, say, dplyr, but it also has a lot of functionality like pivot, melt, and unstack that is often not implemented by other libraries. It has also existed for more than a decade at this point, so there's a plethora of resources and Stack Overflow questions.
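For anyone who hasn't used those reshaping methods, here is a minimal sketch of what pivot, melt, and unstack do. The city/year/temp table is made up purely for illustration; only the pandas calls themselves are real.

```python
import pandas as pd

# Hypothetical "long" table: one row per (city, year) observation.
long_df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome"],
    "year": [2020, 2021, 2020, 2021],
    "temp": [5.0, 6.0, 15.0, 16.0],
})

# pivot: long -> wide, one row per city and one column per year.
wide = long_df.pivot(index="city", columns="year", values="temp")

# unstack: the same reshape, expressed by moving the "year" level
# of a MultiIndex into the columns.
wide_unstacked = long_df.set_index(["city", "year"])["temp"].unstack("year")

# melt: wide -> long again, turning the year columns back into rows.
back_to_long = wide.reset_index().melt(
    id_vars="city", var_name="year", value_name="temp"
)
```

Here pivot and unstack produce the same wide table, and melt recovers the original four rows (possibly in a different order).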
On top of that, these days I just use ChatGPT to generate some of my pandas tasks. ChatGPT and other coding assistants know pandas really well so it’s super easy.
But I think once you get to know pandas, after a while you just learn all the weird quirks, and in return you gain huge benefits from all the things it can do and all the other libraries you can use with it.
I've been living in the shadow of pandas for about a decade now, and the only thing I learned is to avoid using it.
I 100% agree that pandas addresses all the pain points of data analysis in the wild, and this is precisely why it is so popular. My point is, it doesn't address them well. It seems like a conglomerate of special cases, written for a specific problem its author was facing, with little concern for consistency, generality or other use cases that might arise.
In my usage, any time saved by its (very useful) methods tends to be lost to fixing subtle bugs introduced by strange pandas behaviours.
In my use cases, I reindex the data using pandas and get it to numpy arrays as soon as I can, and work with those, with a small library of utilities I wrote over the years. I'd gladly use a "sane pandas" instead.
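A rough sketch of that "reindex in pandas, then drop to numpy" workflow, assuming a daily time series with a missing day. The data and the zero-fill step are illustrative assumptions, not the actual utilities described above.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a gap: 2024-01-03 is missing.
s = pd.Series(
    [1.0, 2.0, 4.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04"]),
)

# Let pandas do the label alignment: reindex onto a complete daily index.
# Dates absent from the original series become NaN.
full_index = pd.date_range("2024-01-01", "2024-01-04", freq="D")
aligned = s.reindex(full_index)

# Leave pandas as early as possible and work on a plain array.
values = aligned.to_numpy()

# From here on it is ordinary numpy, e.g. an explicit rule for gaps
# (zero-filling is just an assumption for this example).
filled = np.where(np.isnan(values), 0.0, values)
```

The idea is that pandas handles only the label alignment, and everything after `to_numpy()` is positional array code with no index semantics involved.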
Aye, but we've learned it, we've got code bases written in it, and many of us are much more data kids than "real devs".
I get that it doesn't follow best practices, but it does do what it needs to. Speed has been an issue, and it's exciting to see that problem being solved.
Interesting to see so many people recently saying "polars looks great, but no way I'll rewrite". This library seems to give a lot of people, myself included, exactly what we want. I look forward to trying it.