Comment by kylejrp

4 years ago

As part of the Stack Overflow April Fools' prank, we did some data analysis on copy behavior on the site [0]. The most copied answer during the collection period (~1 month) was "How to iterate over rows in a DataFrame in Pandas" [1], receiving 11k copies!

[0] https://stackoverflow.blog/2021/04/19/how-often-do-people-ac...

[1] https://stackoverflow.com/a/16476974/16476924

That’s sad, as when you find yourself iterating over rows in pandas you’re almost invariably doing some wrong or very very sub optimally.

  • To me it's an means to an end. I don't care if my solution takes 100ms instead of 1ms, it's the superior choice for me if it takes me 1 minute to do it instead of 10 minutes to learn something new.

    • True, but sometimes these 10 minutes help you to discover something new that will improve your code.

      I had a few of these cases in my life:

      - discovering optimized patterns in Perl, which led to code I could not understand the next day

      - discovering decorators in Python, which led to better code

      - discovering comprehensions in Python (a magical thing) that led to better code, except when I wanted to be too clever and ended up with Perl-like code

  • I iterate over rows in pandas fairly often for plotting purposes. Anytime I want to draw something more complicated than a single point for each row, I find it's simple and straight-forward to just iterrows() and call the appropriate matplotlib functions for each. It does mean some plots that are conceptually pretty simple end up taking ~5 seconds to draw, but I don't mind. Is there really a better alternative that isn't super complicated? Keep in mind that I frequently change my mind about what I'm plotting, so simple code is really good (it's usually easier to modify) even if it's a little slower.

  • >That’s sad, as when you find yourself iterating over rows in pandas you’re almost invariably doing some wrong or very very sub optimally.

    Humans writing code is suboptimal. I can't wait for the day when robots/AI do it for us. I just hope it leads to a utopia and not a dystopia.

  • I'm glad that DataFrames don't iterate by default. It's good design to make suboptimal features hard to access.

I got bitten by that prank when copying code from a question, to see what it did (it was something obviously harmless). I was rather annoyed for about two seconds before I realized what date it was. :)