Comment by nazgul17
14 hours ago
Worth pointing out that calculating p-values on a wide set of metrics and selecting those under $threshold (called p-hacking) is not statistically sound. Who cares, though: we are not an academic journal, but a pill of knowledge.
The idea is, since data has a ~1/20 chance of having a p < 0.05 under the null hypothesis (when there is no real effect, p-values are uniformly distributed), you are bound to get false positives if you test enough metrics. In academia it's definitely not something you'd do, but I think here it's fine.
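A quick simulation makes the ~1/20 figure concrete (illustrative sketch, not from this thread: sample sizes, seed, and the use of a two-sample z-test are my choices): draw both groups from the *same* distribution, so the null is true by construction, and count how often the test still reports p < 0.05.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

def z_test_p(a, b):
    """Two-sided two-sample z-test p-value (large-n approximation)."""
    se = sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = (a.mean() - b.mean()) / se
    # p = 2 * (1 - Phi(|z|)), with Phi via the error function
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# 10,000 "metrics", each compared between two null (identical) groups
n_tests, n = 10_000, 200
pvals = np.array([
    z_test_p(rng.normal(size=n), rng.normal(size=n))
    for _ in range(n_tests)
])

# Fraction of false positives lands near 0.05: roughly 1 in 20
print((pvals < 0.05).mean())
```

So if you scan dozens of metrics and keep whatever clears the threshold, a few "significant" results are expected even when nothing is going on.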
@OP have you considered calculating Cohen's effect size? p only tells us that, given the magnitude of the differences and the number of samples, we are "pretty sure" the difference is real. Cohen's `d` tells us how big the difference is on a standardized scale.
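For reference, Cohen's `d` for two independent samples is just the difference in means divided by the pooled standard deviation. A minimal sketch (the sample data here is made up for illustration):

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d for two independent samples, pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, size=500)  # "control": simulated data
b = rng.normal(0.5, 1.0, size=500)  # "treatment": true shift of 0.5 sd

# Recovers roughly 0.5, a "medium" effect by Cohen's rule of thumb
# (~0.2 small, ~0.5 medium, ~0.8 large)
print(cohens_d(b, a))
```

Unlike p, this number doesn't shrink toward "significant" just because you collected more samples; it stays an estimate of how big the difference actually is.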
> The idea is, since data has a ~1/20 chance of having a p < 0.05
Are you saying p is uniformly distributed over any data set? That doesn't jibe with my limited understanding of entropy. What's this based on?