Comment by audiometry

4 years ago

That’s sad, as when you find yourself iterating over rows in pandas you’re almost invariably doing something wrong or very, very suboptimally.

To me it's a means to an end. I don't care if my solution takes 100ms instead of 1ms; it's the superior choice for me if it takes me 1 minute to write instead of 10 minutes to learn something new.

  • True, but sometimes these 10 minutes help you to discover something new that will improve your code.

    I had a few of these cases in my life:

    - discovering optimized patterns in Perl, which led to code I could not understand the next day

    - discovering decorators in Python, which led to better code

    - discovering comprehensions in Python (a magical thing) that led to better code, except when I wanted to be too clever and ended up with Perl-like code

I iterate over rows in pandas fairly often for plotting purposes. Any time I want to draw something more complicated than a single point for each row, I find it simple and straightforward to just call iterrows() and then the appropriate matplotlib functions for each row. It does mean some plots that are conceptually pretty simple end up taking ~5 seconds to draw, but I don't mind. Is there really a better alternative that isn't super complicated? Keep in mind that I frequently change my mind about what I'm plotting, so simple code is really good (it's usually easier to modify) even if it's a little slower.
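
For what it's worth, here is a rough sketch of two alternatives, assuming each row describes one line segment. The column names x0, y0, x1, y1 and the data are made up for illustration. The first keeps the per-row loop but swaps iterrows() for itertuples(), which is usually faster because it does not build a Series for every row. The second hands all the segments to matplotlib in one call via LineCollection. Whether either one actually helps depends on what you're drawing.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.collections import LineCollection

# Hypothetical data: one line segment per row, from (x0, y0) to (x1, y1).
df = pd.DataFrame({
    "x0": [0.0, 1.0, 2.0],
    "y0": [0.0, 1.0, 0.5],
    "x1": [1.0, 2.0, 3.0],
    "y1": [1.0, 0.5, 2.0],
})

fig, (ax_loop, ax_batch) = plt.subplots(1, 2)

# Option 1: still a per-row loop, but with itertuples() instead of iterrows().
for row in df.itertuples(index=False):
    ax_loop.plot([row.x0, row.x1], [row.y0, row.y1], color="C0")

# Option 2: build an (n_rows, 2, 2) array of segments straight from the
# columns and add them to the axes as a single collection.
starts = df[["x0", "y0"]].to_numpy()
ends = df[["x1", "y1"]].to_numpy()
segments = np.stack([starts, ends], axis=1)
ax_batch.add_collection(LineCollection(segments, colors="C0"))
ax_batch.autoscale()  # collections do not rescale the axes on their own
```

Option 1 is a one-word change from the iterrows() version, so it stays easy to modify. Option 2 is a bit less flexible but avoids creating one artist per row.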

>That’s sad, as when you find yourself iterating over rows in pandas you’re almost invariably doing something wrong or very, very suboptimally.

Humans writing code is suboptimal. I can't wait for the day when robots/AI do it for us. I just hope it leads to a utopia and not a dystopia.

I'm glad that DataFrames don't iterate by default. It's good design to make suboptimal features hard to access.