
Comment by st-keller

1 day ago

"This renders the meaning of significance-testing unclear; it is calculating precisely the odds of the data under scenarios known a priori to be false."

I cannot see the problem in that. To get meaningful results we often calculate with simplified models, which are known to be false in a strict sense. We use Newton's laws; we analyze electric networks based on simplifications; a bank year used to be 360 days! Works well.

What did I miss?

The problem is basically that you can always buy a significant result with money: a large enough N practically always yields a "significant" result, since the null hypothesis is never exactly true. That's a serious issue if you see research as the pursuit of truth.
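A rough sketch of that point (the true effect size of 0.01, the one-sample t-test, and the sample sizes are all illustrative assumptions, not taken from the article): with any nonzero effect, however tiny, the p-value eventually falls below 0.05 as N grows.

```python
# Sketch: a tiny but nonzero effect becomes "significant" once N is large
# enough. Effect size, test and sample sizes are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_mean = 0.01  # practically negligible, but not exactly zero

for n in [100, 1_000, 10_000, 100_000, 1_000_000]:
    sample = rng.normal(loc=true_mean, scale=1.0, size=n)
    _, p = stats.ttest_1samp(sample, popmean=0.0)  # H0: the mean is exactly 0
    print(f"N = {n:>9,}   p = {p:.4f}")

# With enough data the p-value drops below any fixed threshold (e.g. 0.05),
# even though the effect itself never stops being tiny.
```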

Back when I wrote a loan repayment calculator, there were 47 different common ways to 'day count' (used in calculating payments for incomplete repayment periods; e.g. in monthly payments, what is the 1st-13th of Aug 2025 as a fraction of Aug 2025?).
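To give a flavour of why the conventions multiply, here is a toy comparison of two simplified day-count rules for that very period (illustrative only, not any of the 47 variants referred to above):

```python
# Toy comparison of two simplified day-count rules for the same period,
# 2025-08-01 to 2025-08-13, expressed as a fraction of August 2025.
# (Illustrative only; real conventions have many more edge cases.)
from datetime import date

start, end = date(2025, 8, 1), date(2025, 8, 13)
month_start, next_month = date(2025, 8, 1), date(2025, 9, 1)

# Actual/actual-style: count real calendar days, end date exclusive.
actual = (end - start).days / (next_month - month_start).days   # 12 / 31

# 30/360-style: pretend every month has 30 days.
thirty_360 = (end.day - start.day) / 30                         # 12 / 30

print(f"actual/actual : {actual:.4f}")
print(f"30/360        : {thirty_360:.4f}")
```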

There is a known maximum error introduced by those simplifications. Put the other way around, Einstein is a refinement of Newton: special relativity converges to Newtonian mechanics at low speeds.
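A quick numeric check of that convergence, comparing relativistic and Newtonian kinetic energy per unit mass (the chosen speeds are arbitrary):

```python
# Relative difference between relativistic kinetic energy (gamma - 1) * c^2
# and Newtonian 0.5 * v^2, per unit mass, at a few fractions of light speed.
import math

c = 299_792_458.0  # speed of light in m/s

for frac in [0.01, 0.1, 0.3, 0.5]:
    v = frac * c
    gamma = 1.0 / math.sqrt(1.0 - frac ** 2)
    rel_ke = (gamma - 1.0) * c ** 2   # relativistic kinetic energy per kg
    newt_ke = 0.5 * v ** 2            # Newtonian kinetic energy per kg
    print(f"v = {frac:4.2f} c   relative difference = {(rel_ke - newt_ke) / rel_ke:.2%}")

# The leading correction scales like (v/c)^2, so at everyday speeds
# (a car at 30 m/s is about 1e-7 c) the discrepancy is of order 1e-14.
```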

You didn't really miss anything. The article is incomplete, and wrongly suggests that something like "false" even exists in statistics. Really, something is only false "with an x% probability of it actually being true nonetheless", meaning that you have to "statistic harder" if you want to get x down. Usually the best way to do that is to increase the number of tries/samples N. What the article completely misses is that for sufficiently large N you don't have to care anymore, and might as well use false/true as absolutes, because you pass the threshold of "will happen once within the lifetime of a bazillion universes" or something.

The problem is, of course, that lots and lots of statistics are done with a low N. The social sciences, medicine, and economics are necessarily always in the very-low-N range, and therefore always have problematic statistics. So they try to "statistic harder" without being able to increase N, thereby just massaging their numbers enough to prove a desired conclusion. Or they increase N a little and claim to have escaped the low-N problem.
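A toy simulation of that failure mode, with invented numbers and no reference to any particular field: re-running a small null-effect study until the p-value clears the threshold.

```python
# Toy version of "trying until it works" at low N: with a true effect of
# exactly zero, re-running a small study until p < 0.05 eventually produces
# a spurious "significant" finding. All numbers are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, alpha = 20, 0.05   # deliberately low N, conventional threshold

attempt = 0
while True:
    attempt += 1
    sample = rng.normal(loc=0.0, scale=1.0, size=n)   # no real effect
    _, p = stats.ttest_1samp(sample, popmean=0.0)
    if p < alpha:
        break

print(f"'significant' (p = {p:.3f}) after {attempt} attempts, "
      f"even though the true effect is exactly zero")
# Roughly 1 attempt in 20 clears the threshold purely by chance.
```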

  • A frequentist interpretation of inference assumes parameters have fixed but unknown values. In this paradigm, it is sensible to speak of the statement "this parameter's value is zero" as either true or false.

    I do not think it is accurate to portray the author as someone who does not understand asymptotic statistics.

    • > it is sensible to speak of the statement "this parameter's value is zero" as either true or false.

      Nope. The correct way is rather something like "the measurements/polls/statistics x ± ε are consistent with this parameter's true value being zero", where x is your measured value and ε is some measurement error, accuracy bound, or statistical deviation. x will never be exactly zero, but zero can lie within the interval [x - ε; x + ε]. (A small numeric sketch follows below.)

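      A minimal numeric sketch of that reading, taking ε to be an approximate 95% confidence half-width around the sample mean (the data are invented for illustration):

      ```python
      # Sketch of the "x ± ε is consistent with zero" reading: take ε to be an
      # approximate 95% confidence half-width and check whether zero lies in
      # [x - ε, x + ε]. The data below are invented for illustration.
      import numpy as np

      rng = np.random.default_rng(42)
      sample = rng.normal(loc=0.02, scale=1.0, size=500)   # true mean near zero

      x = sample.mean()
      eps = 1.96 * sample.std(ddof=1) / np.sqrt(len(sample))   # ~95% half-width

      lo, hi = x - eps, x + eps
      print(f"x = {x:+.4f}, interval = [{lo:+.4f}, {hi:+.4f}]")
      print("zero is consistent with the data" if lo <= 0.0 <= hi
            else "zero lies outside the interval")
      ```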

It's a quantitative problem. How big is the error introduced by the simplification?