Comment by analog31

6 months ago

In my experience, there are people out there who don't program, or who don't feel that it's a productive way of doing things. I'm firmly in the Python camp, but I recognize that my workplace has several JMP licenses, and that the majority of engineers are satisfied with Excel. And I never let anybody see how long it takes me to do things. ;-)

However, those people also belong to the most-of-the-world who are still leery of "open source" or anything that doesn't come from a known brand.

This thing could be an option for someone who wants to mess around with data but isn't comfortable mentioning it to the boss until they see for themselves if it's worthwhile.

What Python libraries do you prefer? Even after doing this for years, I have trouble making anything remotely complicated in matplotlib without at least one look at the documentation.

For data viz, I'm absolutely smitten with R and ggplot. It works the same way as my brain: "OK, I want to use the students dataset, specifically the age variable, I want to make a histogram, and I'd like to label the axes." You build the viz in that order, with one function call for each thought.
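
For comparison, here is roughly how that same sequence of thoughts might look in plain matplotlib. This is only a sketch: the `students` data is made up for illustration, and the bin choice and axis labels are assumptions rather than anything from a real dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

# "I want to use the students dataset": made-up ages, for illustration only.
rng = np.random.default_rng(0)
students = {"age": rng.integers(18, 30, size=200)}  # integer ages 18..29

# "...specifically the age variable, I want to make a histogram"
fig, ax = plt.subplots()
counts, edges, _ = ax.hist(students["age"], bins=np.arange(18, 31))

# "...and I'd like to label the axes"
ax.set_xlabel("Age (years)")
ax.set_ylabel("Number of students")
```

The order of the calls follows the order of the thoughts fairly closely, although the choice of data and the choice of geometry end up in a single `hist` call instead of separate steps.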

  • I actually use straight matplotlib, or for quick-and-dirty, pyplot. Every notebook starts with the same boilerplate, turning on auto-reload, then numpy, pyplot, asdf. And then my own weird libraries, or those shared with colleagues. Occasionally OpenCV, pyserial, sympy, and other odds and ends.

    I have a Python "wrapper" for every piece of lab equipment that I touch.

    I'm a physicist, and I work on developing measurement equipment. My graphing needs tend to be simplistic, with a big factor being the ability to visualize something quickly and then plan the next step (or realize I screwed up and start over). I'm often the only reader of my graphs.

    My work is all secret, so I don't publish, except an occasional patent. The graphing needs for patents are their own beast, arcane, and perhaps a bit repulsive.

    I noticed your comment suggests more of a "life science" interest, and I think those fields may place a heavier burden on visualization. So I wouldn't be shocked if the physical and life sciences had different graphing needs. I suspect pyplot has a closer vibe to what you're using than straight matplotlib does, but maybe not close enough. There have been attempts to wrap mpl in a ggplot-like interface, but I don't know how successful they have been.
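
A notebook opening like the one described in the reply above, together with the two plotting styles it mentions, might look something like the sketch below. The signal is invented for illustration, the auto-reload magics are left as comments so the block also runs as a plain script, and reading "pyplot" as the implicit interface and "straight matplotlib" as the explicit Figure/Axes interface is an assumption about what was meant.

```python
# In a Jupyter/IPython notebook, auto-reload is usually turned on with:
# %load_ext autoreload
# %autoreload 2

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical measurement: a decaying oscillation, made up for illustration.
t = np.linspace(0.0, 1.0, 500)
signal = np.exp(-3.0 * t) * np.sin(2 * np.pi * 10 * t)

# Quick-and-dirty pyplot style: draws on the implicit "current" figure and axes.
plt.plot(t, signal)
plt.xlabel("t (s)")
plt.ylabel("signal (a.u.)")

# Explicit style: create the Figure and Axes objects and call methods on them.
fig, ax = plt.subplots()
ax.plot(t, signal)
ax.set_xlabel("t (s)")
ax.set_ylabel("signal (a.u.)")
```

The pyplot version is shorter to type for a throwaway look at the data; the explicit version tends to be easier to manage once a notebook has several figures or subplots.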