Comment by rdedev
3 hours ago
While polars is better if you work with predefined data formats, pandas is imo still better as a general purpose table container.
I work with chemical datasets, and this always involves converting SMILES strings to RDKit molecule objects. Polars cannot do this as simply as calling .map in pandas.
Pandas is also much better for EDA, so calling it worse in every instance is not true. If you are doing pure data manipulation, then go ahead with polars.
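For anyone who hasn't seen the pattern, here is a rough sketch of the .map call on a column of SMILES strings. The `Molecule` class and `parse_smiles` function are hypothetical stand-ins so the snippet runs without RDKit; with RDKit installed you would pass `Chem.MolFromSmiles` instead.

```python
import pandas as pd


class Molecule:
    """Hypothetical stand-in for an RDKit molecule object."""

    def __init__(self, smiles: str) -> None:
        self.smiles = smiles


def parse_smiles(smiles: str) -> Molecule:
    # Hypothetical parser; with RDKit this would be Chem.MolFromSmiles.
    return Molecule(smiles)


smiles = pd.Series(["CCO", "c1ccccc1", "CC(=O)O"], name="smiles")

# .map calls the function once per element; the result is an
# object-dtype Series holding arbitrary Python objects.
mols = smiles.map(parse_smiles)
```

The resulting Series just holds Python objects, which is the "general-purpose table container" behaviour I mean.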
Map is one operation pandas does nicely that most other “wrap a fast language” dataframe tools do poorly.
When it feels like you're writing some external UDF that's executed in another environment, it does not feel as nice as throwing in a lambda, even if the lambda is not ideal.
You have map_elements in polars, which does exactly this.
https://docs.pola.rs/api/python/dev/reference/expressions/ap...
You can also iter_rows into a lambda if you really want to.
https://docs.pola.rs/api/python/stable/reference/dataframe/a...
Personally I find it extremely rare that I need to do this given Polars expressions are so comprehensive, including when.then.otherwise when all else fails.