Comment by malshe
4 years ago
Yes, the author has shared the link to the R package here:
https://cran.r-project.org/web/packages/XICOR/index.html
Edit: R code from Dr. Chatterjee's Stanford page is here - https://souravchatterjee.su.domains//xi.R
If you have never worked with R, the code may seem clunky, so I suggest checking out the Python implementation on GitHub here:
https://github.com/czbiohub/xicor
The Python library is not from the original author, though. Still, the code is easy to read, and it works with pandas as well.
If anyone is interested, I've also published a Go implementation [1] of the code for float64 slices.
Results seem to exactly match the R and Python implementations, so there will be a second pass focusing on performance, stability, and support for categorical variables.
[1] https://github.com/tpaschalis/xicor-go
The current version of the Python lib seems to be extremely badly written. Or is the algorithm itself that slow? It takes something like 21s to compute the correlation for just 10k samples.
This issue contains simple code that is claimed to be >300x faster: https://github.com/czbiohub/xicor/issues/17
Thanks for locating the solution. I didn’t check the Python code myself, so I wasn’t sure what was going on with the slow processing.
Thanks, the Python code is very clear and simple and makes it super easy to understand the idea without having to digest the paper.
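For anyone who wants the idea in code without opening either repo, here is a minimal NumPy sketch of the coefficient, written from the definition in the paper as I understand it: sort the pairs by x, let r_i be the number of y values <= the i-th y in that order and l_i the number of y values >= it, then xi = 1 - n * sum|r_{i+1} - r_i| / (2 * sum l_i * (n - l_i)). This is not the code from czbiohub/xicor or from the linked issue. The function name xi_correlation is my own. Unlike the R code, it does not break ties in x at random; it uses a stable sort, so results may differ when x has repeated values. Sorting dominates the cost, so it should run in O(n log n).

```python
import numpy as np


def xi_correlation(x, y):
    """Sketch of Chatterjee's xi coefficient (hypothetical helper, not a library API).

    Assumption: ties in x are kept in input order (stable sort) instead of
    being broken at random as in the reference R code.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if y.size != n or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    # Rearrange y so that the corresponding x values are in increasing order.
    y_by_x = y[np.argsort(x, kind="stable")]

    # r[i] = number of j with y[j] <= y_by_x[i]
    # l[i] = number of j with y[j] >= y_by_x[i]
    y_sorted = np.sort(y)
    r = np.searchsorted(y_sorted, y_by_x, side="right")
    l = n - np.searchsorted(y_sorted, y_by_x, side="left")

    numerator = n * np.abs(np.diff(r)).sum()
    denominator = 2 * (l * (n - l)).sum()
    if denominator == 0:
        # y is constant, so the coefficient is undefined.
        return float("nan")
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=10_000)
    print(xi_correlation(x, np.sin(x)))            # y is a function of x
    print(xi_correlation(x, rng.normal(size=10_000)))  # independent noise
```

Note that the coefficient is not symmetric: xi_correlation(x, y) measures how well y is explained as a function of x, so swapping the arguments generally gives a different value.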