Comment by data-ottawa
11 days ago
Map is one operation pandas does nicely that most other “wrap a fast language” dataframe tools do poorly.
When it feels like you’re writing some external UDF that's executed in another environment, it does not feel as nice as throwing in a lambda, even if the lambda is not ideal.
You have map_elements in Polars, which does exactly this.
https://docs.pola.rs/api/python/dev/reference/expressions/ap...
You can also iter_rows into a lambda if you really want to.
https://docs.pola.rs/api/python/stable/reference/dataframe/a...
Personally I find it extremely rare that I need to do this given Polars expressions are so comprehensive, including when.then.otherwise when all else fails.
That one has a bit more friction than pandas because of the return schema requirement; pandas lets you get away with this bad practice.
It also does batches when you declare scalar outputs, but you can't control the batch size, which usually isn't an issue, but I've run into situations where it is.