Comment by foltik

3 months ago

I was curious and read through the paper you linked. Here's my shot at rational thinking. A few things stood out:

1. Arbitrary prior

In the peer-review notes on p.26, a reviewer questions the basis of their Bayesian prior: "they never clearly wrote down ... that the theoretical GZ effect size would be 'Z/sqrt(N) = 0.1'".

The authors reply: "The use of this prior in the Bayesian meta-analysis is an arbitrary choice based on the overall frequentist meta-analysis, and the previous meta-analyses e.g. Storm & Tressoldi, 2010."

That's a problem because a Bayesian prior represents your initial belief about the true effect before looking at the current data. It's supposed to come from independent evidence or theoretical reasoning. Using the same dataset, or previous meta-analyses of largely the same studies, to set the prior is circular reasoning. In other words, they assumed from the start that the true effect size was roughly 0.1, then unsurprisingly "found" an effect size around 0.08–0.1.
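To see why the choice matters, here's a minimal normal-normal sketch. This is not the authors' actual meta-analytic model; the standard error (~0.02, back-computed from the reported CI of 0.04–0.12) and the prior standard deviations are my own illustrative assumptions.

```python
# Minimal sketch: conjugate normal-normal update of a pooled effect estimate.
# NOT the paper's model; d_hat is the reported pooled effect, se is back-computed
# from the reported CI, and the priors are illustrative.
import math

def posterior(d_hat, se, prior_mean, prior_sd):
    """Posterior (mean, sd) for a normal likelihood with a normal prior."""
    prec = 1 / se**2 + 1 / prior_sd**2
    mean = (d_hat / se**2 + prior_mean / prior_sd**2) / prec
    return mean, math.sqrt(1 / prec)

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

d_hat, se = 0.08, 0.02   # reported pooled effect; se ~ (0.12 - 0.04) / (2 * 1.96)

for label, prior_mean, prior_sd in [
    ("prior centered on the assumed 0.1 psi effect", 0.10, 0.05),
    ("skeptical prior centered on zero",             0.00, 0.05),
]:
    m, s = posterior(d_hat, se, prior_mean, prior_sd)
    # Savage-Dickey ratio: evidence for a nonzero effect vs. the point null,
    # under this particular prior.
    bf10 = normal_pdf(0.0, prior_mean, prior_sd) / normal_pdf(0.0, m, s)
    print(f"{label}: posterior {m:.3f} +/- {s:.3f}, BF10 ~ {bf10:.0f}")
```

Centering the prior where the data already sit both pulls the posterior toward 0.1 and inflates the Bayes factor severalfold relative to a neutral prior, which is exactly what a circular prior buys you.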

2. Publication bias

On p. 10, the authors admit that "for publication bias to attenuate (to "explain away") the observed overall effect size, affirmative results would need to be at least four-fold more likely to be published than non-affirmative results."

A modest 4x preference for publishing positive results would be enough to erase the significance.

They do claim that "the similarity of effect size between the two levels of peer-review add further support to the hypothesis that the 'file drawer' is empty."

But that's faulty reasoning: publication bias is about which studies get published at all, while comparing conference proceedings with journals only looks at work that was already published somewhere.

Additionally, their own inclusion criteria are "peer reviewed and not peer-reviewed studies e.g., published in proceedings excluding dissertations." They explicitly removed dissertations and other gray literature, the most common home for null findings, which makes meaningful publication bias in their dataset more likely, not less.
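To get a feel for the scale of that 4:1 ratio, here's a toy Monte Carlo with the true effect fixed at exactly zero. The per-study trial count and the definition of "affirmative" (one-sided p < .05) are my assumptions, not taken from the paper; the point is only that selective publication on its own manufactures a positive pooled "effect" in the paper's z/sqrt(n) metric.

```python
# Toy simulation: how much pooled "effect" does a 4:1 publication preference
# for affirmative results create when the true effect is zero?
import math, random

random.seed(0)
N_STUDIES, TRIALS_PER_STUDY = 100_000, 40       # study size is my assumption
PUBLISH_AFFIRMATIVE, PUBLISH_NULL = 1.0, 0.25   # the 4:1 ratio discussed on p.10

published = []
for _ in range(N_STUDIES):
    z = random.gauss(0.0, 1.0)          # study z-score when the true effect is zero
    affirmative = z > 1.645             # one-sided p < .05 (my definition of "affirmative")
    if random.random() < (PUBLISH_AFFIRMATIVE if affirmative else PUBLISH_NULL):
        published.append(z)

# Pooled effect in the paper's metric, z / sqrt(n), averaged over published studies.
pooled = sum(z / math.sqrt(TRIALS_PER_STUDY) for z in published) / len(published)
print(f"published {len(published)} of {N_STUDIES} studies; "
      f"pooled z/sqrt(n) = {pooled:.3f} with a true effect of 0")
```

Even under these rough assumptions the surviving studies pool to a positive value, so "a 4x preference would be needed to explain the effect away" is not a reassuring threshold.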

3. My analysis

The effect size they report is already tiny: Z/sqrt(N) = 0.08 (CI 0.04–0.12) on p.1 and p.7, so the issues above carry real weight. An arbitrary prior and a modest, unacknowledged publication bias could easily turn a negligible signal into an apparently "statistically significant" effect. And because the median statistical power of the included studies is only 0.088 (p.10), nearly all of them were too weak to detect an effect of this size even if it were real. In that regime, small analytic and publication biases dominate the outcome.
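For a sense of how weak that is, here's a rough power check, assuming a simple one-sided z-test at alpha = .05 (the paper's own power calculation may differ in detail):

```python
# Back-of-the-envelope power check for an effect of z/sqrt(n) = 0.08.
from math import erf, sqrt

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

EFFECT, ALPHA_Z = 0.08, 1.645            # effect in z/sqrt(n) units; one-sided z critical value

def power(n_trials):
    # Under the alternative, the study z-score is shifted by EFFECT * sqrt(n).
    return 1 - norm_cdf(ALPHA_Z - EFFECT * sqrt(n_trials))

for n in (20, 40, 80, 160):
    print(f"n = {n:4d} trials -> power ~ {power(n):.2f}")

# Trials needed for 80% power: solve EFFECT * sqrt(n) = z_alpha + z_beta.
Z_BETA = 0.8416                           # Phi^-1(0.80)
n_needed = ((ALPHA_Z + Z_BETA) / EFFECT) ** 2
print(f"~{n_needed:.0f} trials per study needed for 80% power")
```

Under these assumptions, studies of a few dozen trials have on the order of 10% power against an effect of 0.08, and roughly a thousand trials per study would be needed to reach a conventional 80%.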

Under more careful scrutiny, what looks like evidence for psi is just the echo of their own assumptions amplified by selective visibility.