Comment by geocar

10 days ago

> I can’t think of a single time I’ve needed a sorted list of only numbers.

Gosh. Let me try to convince you.

I use permutation arrays all the time: lists of indexes that can be used across multiple vectors.

This is much faster than the pattern of scanning rows, constructing tuples of (thingToSort . thingIWantInThatOrder) and making a custom sort function, and destructuring those tuples...

And really, not having to write custom sort functions is really really nice.

> Especially in telemetry, where mean is easy and median is not.

Funny. Yes median is obvious with a permutation array, and maybe mean is less so.

When your data is really big and not very variable, mean of x is roughly the same as the mean of any sufficient sample of x, and that sample can be meaningfully represented as a permutation array!

You can get such an array with reservoir sampling and some maths, and (depending on what you know of your data and variance) sometimes even simpler tricks.

That's kindof actually how the "faster than dijkstra" trick referred-to in the article works: Data sets with small variance has this same property that the min of x is roughly the same as the min of a sufficient sample of x (where the size of sufficient has to do with the variance). And so on.

Another big use-case in my code is trees: Apter trees have a flat memory layout which is convenient for permutation arrays which can simultaneously represent index, rotation, tombstones, and all sorts of other things you might need to do with a tree.

Give it a dig. There's good stuff in there.