Comment by WCSTombs

14 hours ago

If your arrays have more than two dimensions, please consider using Xarray [1], which adds dimension naming to NumPy arrays. Broadcasting and alignment then become automatic without needing to transpose, add dummy axes, or anything like that. I believe that alone solves most of the complaints in the article.
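A minimal sketch of the difference, with made-up names and shapes (`temps`, `weights`, and the dimension labels are purely illustrative, not from the article):

```python
import numpy as np
import xarray as xr

# Hypothetical data: values over (time, lat, lon) plus one weight per latitude.
temps = xr.DataArray(
    np.arange(24.0).reshape(4, 3, 2), dims=("time", "lat", "lon")
)
weights = xr.DataArray(np.array([0.5, 1.0, 2.0]), dims=("lat",))

# Plain NumPy: a dummy axis is needed so that (3,) lines up with axis 1 of (4, 3, 2).
weighted_np = temps.values * weights.values[:, None]

# Xarray: the "lat" dimension is matched by name, no dummy axis required.
weighted_xr = temps * weights

print(weighted_xr.dims)
print(np.array_equal(weighted_xr.values, weighted_np))
```

The NumPy line only works if you remember which axis is latitude; the Xarray line does not care about axis order.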

Compared to NumPy, Xarray is a little thin in certain areas like linear algebra, but since it's very easy to drop back to NumPy from Xarray, what I've done in the past is add little helper functions for any specific NumPy stuff I need that isn't already included, so I only need to understand the NumPy version of the API well enough one time to write that helper function and its tests. (To be clear, though, the majority of NumPy ufuncs are supported out of the box.)
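As a rough illustration of that kind of helper (not the exact code I use; `det` and the dimension names are hypothetical), `xr.apply_ufunc` can wrap a NumPy routine that works on the trailing axes:

```python
import numpy as np
import xarray as xr


def det(da, row_dim, col_dim):
    """Hypothetical helper: determinant over two named dimensions."""
    # apply_ufunc moves the core dims to the end, which is the layout
    # np.linalg.det expects: (..., n, n) -> (...).
    return xr.apply_ufunc(
        np.linalg.det,
        da,
        input_core_dims=[[row_dim, col_dim]],
    )


# Three 2x2 matrices, k * identity, stacked along a "batch" dimension.
mats = xr.DataArray(
    np.stack([np.eye(2) * k for k in (1.0, 2.0, 3.0)]),
    dims=("batch", "i", "j"),
)
dets = det(mats, "i", "j")

# Axis order of the input does not matter, only the names do.
shuffled = mats.transpose("i", "batch", "j")
dets_shuffled = det(shuffled, "i", "j")
```

Once a wrapper like this has tests, the axis conventions of the underlying NumPy function no longer leak into the calling code.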

I'll finish by saying, to contrast with the author, I don't dislike NumPy, but I do find its API and data model to be insufficient for truly multidimensional data. For me three dimensions is the threshold where using Xarray pays off.

[1] https://xarray.dev

Xarray is great. It marries the best of pandas with NumPy.

Indexing like `da.sel(x=some_x).isel(t=-1).mean(["y", "z"])` makes code so easy to write and understand.
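For comparison, here is a small sketch with an invented 4-D array (the shape, the coordinate values, and `some_x` are assumptions for illustration), next to what I believe is the equivalent positional NumPy expression:

```python
import numpy as np
import xarray as xr

# Hypothetical array over (x, y, z, t) with labelled x and t coordinates.
da = xr.DataArray(
    np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 3, 4, 5),
    dims=("x", "y", "z", "t"),
    coords={"x": [10.0, 20.0], "t": [0, 1, 2, 3, 4]},
)
some_x = 20.0

# Select by coordinate label on x, by position on t, then average over y and z.
result = da.sel(x=some_x).isel(t=-1).mean(["y", "z"])

# Positional NumPy version: you must know x is axis 0, t is axis 3,
# and that x == 20.0 lives at index 1.
result_np = da.values[1, :, :, -1].mean()

print(float(result) == result_np)
```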

Broadcasting is never ambiguous because dimension names are respected.

It's very good for geospatial data, allowing you to work in multiple CRSs with the same underlying data.

We also use it a lot for Bayesian modeling via ArviZ [1], since it makes the extra dimensions you get from sampling your posterior easy to handle.

Finally, you can wrap many arrays into datasets, with common coordinates shared across the arrays. This allows you to select `ds.isel(t=-1)` across every array that has a time dimension.
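A small sketch of that, using a made-up dataset (the variable names and values are only for illustration):

```python
import numpy as np
import xarray as xr

# Hypothetical dataset: two variables have a time dimension "t", one does not.
ds = xr.Dataset(
    {
        "temperature": (("t", "x"), np.arange(6.0).reshape(3, 2)),
        "pressure": (("t",), np.array([1.0, 2.0, 3.0])),
        "elevation": (("x",), np.array([100.0, 200.0])),
    },
    coords={"t": [0, 1, 2], "x": [10, 20]},
)

# Take the last time step of every variable that has a "t" dimension.
# Variables without "t" (here, elevation) pass through unchanged.
last = ds.isel(t=-1)

print(last["temperature"].dims, last["pressure"].dims, last["elevation"].dims)
```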

[1] https://www.arviz.org/en/latest/

Seconded. Xarray has mostly replaced bare NumPy for me and it makes me so much more productive.

Is there anything similar to this for something like TensorFlow, Keras, or PyTorch? I haven't used them super recently, but in the past I needed to do all of the things you just described in painful-to-debug ways.

Thanks for sharing this library. I will give it a try.

For a while I had a feeling that I was perhaps a little crazy for seeming to be the only person to really dislike the use of things like `array[:, :, None]` and so forth.

Life comes full circle sometimes. I remember that NumPy roughly came out of the amalgamation of the Numeric and Numarray libraries; I want to imagine that the Numarray people kept fighting these past 20 years to prove they were the right solution, at some point took some money from Elon Musk and renamed to Xarray [0], and finally started beating NumPy.

[0] most of the above is fiction