Comment by jazzkingrt
4 years ago
Can you explain in more detail why this can't be updated online?
We can track the rank of newly sampled pairs in LogN using something like an order statistic tree:
https://en.wikipedia.org/wiki/Order_statistic_tree
But I guess the problem is that with each new pair, O(n) pairs could change their contribution to the correlation coefficient.
Your point is very valid one. I was not able to find out how to keep ranks updated efficiently.
But now we're moving away from being simple to compute.