This has been known ever since the beginning of frequentist hypothesis testing. Fisher warned us not to place too much emphasis on the p-value he asked us to calculate, specifically because it is mainly a measure of sample size, not clinical significance.
Yes, the whole thing has been a bit of a tragedy IMO. A minor tragedy, all things considered, but a tragedy nonetheless.
One interesting thing to keep in mind is that Ronald Fisher did most of his work before the publication of Kolmogorov's probability axioms (1933). There's a real sense in which the statistics used in social sciences diverged from mathematics before the rise of modern statistics.
So there's a lot of tradition going back to the 19th century that's misguided, wrong, or maybe just not best practice.
It's not; that would be quite the misunderstanding of statistical power.
N being big means that small real effects can plausibly be detected as being statistically significant.
It doesn't mean that a larger proportion of measurements are falsely identified as being statistically significant. That will still occur at a 5% frequency or whatever your alpha value is, unless your null is misspecified.
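A quick simulation makes the point. This is a minimal sketch, assuming a one-sample t-test of H0: mu = 0 on normal data; the true effect size 0.05 and the sample sizes are arbitrary illustrative choices, not anything from the comment above. Under a true null the rejection rate stays near alpha for every N, while the small real effect gets detected more and more often as N grows.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    alpha = 0.05
    n_sims = 5000

    def rejection_rate(n, true_mean):
        """Fraction of one-sample t-tests (H0: mu = 0) rejecting at level alpha."""
        rejections = 0
        for _ in range(n_sims):
            sample = rng.normal(loc=true_mean, scale=1.0, size=n)
            res = stats.ttest_1samp(sample, popmean=0.0)
            if res.pvalue < alpha:
                rejections += 1
        return rejections / n_sims

    for n in (20, 200, 2000):
        # Under the null (true mean 0) the false-positive rate stays near alpha
        # for every N; for a small real effect (true mean 0.05) power grows with N.
        print(f"N={n:5d}  "
              f"null rejection rate={rejection_rate(n, 0.0):.3f}  "
              f"power at true mean 0.05={rejection_rate(n, 0.05):.3f}")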
It's standard to set the null hypothesis to be a measure zero set (e.g. mu = 0 or mu1 = mu2). So the probability of the null hypothesis is 0 and the only question remaining is whether your measurement is good enough to detect that.
But even though you know a priori that the measurement can't be exactly 0.000 (to infinitely many decimal places), you don't know a priori whether your measurement is any good or whether you're measuring the right thing.
The probability is only zero a.s.; that's not the same as the null being impossible, and that's a very big difference. And hypothesis tests aren't estimating the probability of the null being true; they're estimating the probability of rejecting the null if the null were true.
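To make that last distinction concrete, here is a sketch in standard notation (T is a generic test statistic, t_obs its observed value; none of this is spelled out in the comments above):

    p = \Pr(T \ge t_{\mathrm{obs}} \mid H_0)
    \Pr(H_0 \mid \text{data}) = \frac{\Pr(\text{data} \mid H_0)\,\Pr(H_0)}{\Pr(\text{data})}

The first line is what the test reports; the second is what people often want it to mean, and computing it requires a prior Pr(H_0) that frequentist tests never specify.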