Comment by rors
2 days ago
I remember attending ACL one year, where the conference organisers ran an experiment to test the effectiveness of double-blind review. They asked reviewers to identify the institution that submitted the anonymised paper. Roughly 50% of reviewers were able to correctly identify the institution, and I think a double-digit percentage were able to predict the authors as well.
The organisers then made the argument that double-blinding was working because 50% of papers were not identified correctly! I was amazed that, even with strong evidence that it was not working, the organisers were able to convince themselves to continue with business as usual.
You're saying "not working" when you have only presented evidence for "not perfect".
That experiment showed that even when reviewers were asked to put deliberate effort into identifying the source of an anonymized paper (something most reviewers probably don't put any conscious effort into normally), only about half succeeded. Compared to not anonymizing at all, where the submitting institution would presumably be known for essentially every paper, that's a substantial effect.
Am I missing some obvious reason why double-blind reviews should only be attempted if the blinding can be achieved with a near-perfect success rate, or are you just setting the bar unreasonably high?
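As a rough back-of-the-envelope sketch (the 50% is the figure quoted above; the candidate-pool size and the unblinded identification rate are assumptions of mine, not numbers from the experiment):

    # Rough sketch with assumed numbers, not data from the ACL experiment.
    chance_rate = 1 / 20    # assumption: ~20 plausible institutions to guess among
    observed_rate = 0.50    # figure quoted upthread for blinded reviewing
    unblinded_rate = 1.00   # assumption: without blinding the source is simply known

    # Fraction of the identifying signal (above chance) that survives blinding,
    # and fraction of papers whose source the blinding actually hid.
    leak = (observed_rate - chance_rate) / (unblinded_rate - chance_rate)
    print(f"identifying signal surviving blinding: {leak:.0%}")      # ~47%
    print(f"papers whose source stayed hidden: {1 - observed_rate:.0%}")  # 50%

Under those assumptions, blinding still leaks roughly half of the identifying signal, but it also hides the source for roughly half of the papers: imperfect, not useless.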
The subtext to this whole comment chain is that you need hands-on experience with qualitative-to-quantitative conversions if you want to reason about the scientific process.
> Am I missing some obvious reason why double-blind reviews should only be attempted if the blinding can be achieved with a near-perfect success rate, or are you just setting the bar unreasonably high?
OP thinks you're looking at this as either all signal or all noise, instead of determining for yourself where the signal begins.