Comment by ploynog
2 days ago
Double-blind review is a mirage that does not hold up. While I was in academia I reviewed a paper that turned out to be a blatant case of plagiarism. It was a clear Level 1 copy according to the IEEE plagiarism levels (uncredited verbatim copying of more than 50% of a single paper). I submitted all of these findings, along with the original paper and a breakdown of which parts were copied (essentially all of it), as my review.
A few days later I got an email from the author (some professor) who wanted to discuss this with me, claiming that the paper was written by some of his students who were not credited as authors. They were inexperienced, made a mistake, yadda yadda yadda. I forwarded the mail to the editors and never heard about the case again. I don't expect that anything happened; the corrective actions for a Level 1 violation are pretty harsh and would have been hard to miss.
The fact that this person was able to obtain my name and contact info shattered any trust I had in the "blind" part of the double-blind review process.
The other two reviewers had recommended to accept the paper without revisions, by the way.
This seems like an issue of administration rather than an issue with the idea of double-blind review. If a review isn't properly blinded and has no observable effect, can it really be called a double-blind review?
Maybe it's more that a non-idealistic model of the real world, and common direct experience, show that incentives strongly favor an administrative approach that compromises the double blind.
Unless there's a better way to do it, I think this shows a need for better structures for governance and auditing of review boards... Information and science care not for our human folly; it's up to us to seek and execute them properly.
I remember attending ACL one year when the conference organisers ran an experiment to test the effectiveness of double-blind review. They asked reviewers to identify the institution that submitted the anonymised paper. Roughly 50% of reviewers were able to correctly identify the institution. I think a double-digit percentage were also able to identify the authors.
The organisers then made the argument that double blind was working because 50% of papers were not identified correctly! I was amazed that even with strong evidence that double blind was not working, the organisers were still able to convince themselves to continue with business as usual.
You're saying "not working" when you have only presented evidence for "not perfect".
That experiment showed that even when reviewers were explicitly asked to identify the source of an anonymized paper (something most probably put no conscious effort into normally), the anonymization still had a substantial effect compared to not anonymizing the papers at all.
Am I missing some obvious reason why double-blind reviews should only be attempted if the blinding can be achieved with a near-perfect success rate, or are you just setting the bar unreasonably high?
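To put rough numbers on that point: the 50% figure is the one reported from the ACL experiment above, but the no-blinding baseline and the pool of plausible institutions below are assumptions I'm making for the sake of the sketch.

    # Back-of-envelope sketch: why ~50% correct identification is evidence of
    # "imperfect", not "not working". Only the 50% figure comes from the ACL
    # experiment described above; everything else is an assumption for illustration.
    observed_correct = 0.50        # reviewers who named the right institution
    baseline_no_blinding = 1.00    # assumed: with author names attached, identification is trivial
    n_plausible_institutions = 10  # assumed: candidate labs a reviewer might guess from

    chance_guess = 1 / n_plausible_institutions  # 0.10

    # Share of reviewer-paper pairs where blinding removed the identifying signal,
    # relative to the assumed no-blinding baseline:
    prevented = (baseline_no_blinding - observed_correct) / baseline_no_blinding
    print(f"identification prevented for ~{prevented:.0%} of pairs")   # ~50%

    # Some "correct" identifications are just lucky guesses, so the true
    # de-anonymization rate sits a bit below the observed 50%:
    beyond_chance = (observed_correct - chance_guess) / (1 - chance_guess)
    print(f"identification beyond chance: ~{beyond_chance:.0%}")       # ~44%

That's a blunt model, but it shows why "half the reviewers guessed right" and "the blinding does nothing" are very different claims.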
The subtext to this whole comment chain is that you need hands-on experience with qualitative-to-quantitative conversions if you want to reason about the scientific process.
> Am I missing some obvious reason why double-blind reviews should only be attempted if the blinding can be achieved with a near-perfect success rate, or are you just setting the bar unreasonably high?
OP thinks you are looking at either signal or noise, instead of determining where the signal begins for yourself.
When we criticize without proposing a fix or alternative, we promote the implicit alternative of tearing something down without fixing it. This is often much worse than letting the imperfect thing stand. So here's a proposal: do what we do in software.
No, really: we have the same problem in software. Software developers under high pressure to move tickets will often resort to the minor fraud of converting unfinished features into bugs by marking them complete when they are not in fact complete. This is very similar to the minor fraud of an academic publishing an overstated / incorrect result to stay competitive with others doing the same. Often it's more efficient in both cases to just ignore the problem, which will generally self-correct with time. If not, we have to think about intervention -- but in software this story has played out a thousand times in a thousand organizations, so we know what intervention looks like.
Acceptance testing. That's the solution. Nobody likes it. Companies don't like to pay for the extra workers and developers don't like the added bureaucracy. But it works. Maybe it's time for some fraction of grant money to go to replication, and for replication to play a bigger role in gating the prestige indicators.
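For anyone who hasn't worked in a shop that does this: an acceptance test is an automated check written against the feature's stated requirements rather than its implementation, and the ticket doesn't close until it passes. A minimal sketch of the shape of one, in pytest style; the order_service module, submit_order function, and the criteria themselves are made up for illustration:

    # Minimal acceptance-test sketch (pytest style). The module, function, and
    # acceptance criteria here are hypothetical; the point is that the check
    # encodes the requirement, not the implementer's claim that the ticket is done.
    from order_service import submit_order  # hypothetical feature under test

    def test_valid_order_is_confirmed_and_stock_is_reduced():
        result = submit_order(item_id="sku-123", quantity=2)
        assert result.status == "confirmed"
        assert result.remaining_stock == result.previous_stock - 2

    def test_out_of_stock_order_is_rejected_explicitly():
        result = submit_order(item_id="sku-123", quantity=10_000)
        assert result.status == "rejected"

The replication analogue would be that a paper's claim doesn't count toward the prestige indicators until an independent group has run the equivalent of that test.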
> This is very similar to the minor fraud of an academic publishing an overstated / incorrect result to stay competitive with others doing the same.
I completely disagree.
For one, academic standards of publishing are not at all the same as the standards for in-house software development. In academia, a published result is typically regarded as a finished product, even if the result is not exhaustive. You cannot push a fix to the paper later; an entirely new paper has to be written and accepted. And this is for good reason: the paper represents a time-stamp of progress in the field that others can build off of. In the sciences, projects can range from 6 months to years, so a literature polluted with half-baked results is a big impediment to planning and resource allocation.
A better comparison for academic publishing would be a major collaborative open source project like the Linux kernel. Any change has to be thoroughly justified and vetted before it is merged, because mistakes cause other people problems and wasted time and effort. Do whatever you like with your own hobbyist project, but if you plan for it to be adopted and integrated into the wider software ecosystem, your code quality needs to be higher and you need to have your interfaces specced out. That's the analogy for academic publishing.
The problems in modern academic publishing are almost entirely caused by the perverse incentives of measuring academic status by publication record (number of publications and impact factor). Lowering publishing standards so academics can play this game better is solving the wrong problem. Standards should be even higher.
Yeah, the alternative to a "double-blind" review that isn't actually blind is a real double-blind review.
The alternative to not enforcing existing rules against plagiarism is to enforce them.
The alternative to ignoring integrity issues, i.e. "minor fraud", in the workplace is to apply ordinary workplace discipline to them.
Seems to me that the review worked: you caught the plagiarism even though the other two reviewers missed it. It's disturbing that the author was somehow able to find your contact information, though!