Comment by parpfish
1 day ago
In theory, asking grad students and early career folks to run replications would be a great training tool.
But the problem isn’t just funding, it’s time. Successfully running a replication doesn’t get you a publication to help your career.
Grad students don’t get to publish a thesis on reproduction. Everyone from the undergraduate research assistant to the tenured professor with research chairs is hyper-focused on publishing as many “positive results” on “novel” work as possible.
Publishing a replication could be a prerequisite for getting the degree.
The question is: how can universities coordinate to add this requirement and gain status from it?
Prerequisite required by whom, and why is that entity motivated to design such a requirement? Universities also want more novel breakthrough papers to boast about and to outshine other universities in the rankings. And if one is honest, other researchers also get more excited about new ideas than about a failed replication, which may fail for a thousand different reasons; the original authors will argue you did something wrong or evaluated them unfairly, and publicly accusing other researchers of doing bad work generally won't help your career. It's a small world: you'd be making enemies of people who will sit on your funding evaluation committees and hiring committees, and it just generally leads to drama. Also, papers are superseded so fast that people don't even care that a no-longer-state-of-the-art paper may have been wrong. There are 5 newer ones that perform better and nobody uses the old one. I'm just stating how things actually are, not saying that this is good, but when you say something "should" happen, think about who exactly is motivated to drive such a change.
2 replies →
I think Arxiv and similar could contribute positively by listing replications/falsifications, with credit to the validating authors. That would be enough of an incentive for aspiring researchers to start making a dent.
But that seems almost trivially solved. In software it's common to value independent verification - e.g. code review. Someone who is only focused on writing new code instead of careful testing, refactoring, or peer review is widely viewed as a shitty developer by their peers. Of course there's management to consider and that's where incentives are skewed, but we're talking about a different structure. Why wouldn't the following work?
A single university or even department could make this change - reproduction is the important work, reproduction is what earns a PhD. Or require some split: maybe 20-50% novel work is also expected. Now the incentives are changed. Potentially, this university develops a reputation for reliable research. Others may follow suit.
Presumably, there's a step in this process where money incentivizes the opposite of my suggestion, and I'm not familiar enough with the process to know which step.
Is it the university itself which will be starved of resources if it's not pumping out novel (yet unreproducible) research?
> Presumably, there's a step in this process where money incentivizes the opposite of my suggestion, and I'm not familiar enough with the process to know which step.
> Is it the university itself which will be starved of resources if it's not pumping out novel (yet unreproducible) research?
Researchers apply for grants to fund their research; the university is generally not paying for it and instead receives a cut of the grant money if it is awarded (i.e., the grant covers the university's costs of providing the facilities to do the research). If a researcher could get funding to reproduce a result then they could absolutely do it, but that's not what funds are usually being handed out for.
Universities are not really motivated to slow down the research careers of their employees, on the contrary. They are very much interested in their employees making novel, highly cited publications and bringing in grants that those publications can lead to.
> In software it's common to value independent verification - e.g. code review. Someone who is only focused on writing new code instead of careful testing, refactoring, or peer review is widely viewed as a shitty developer by their peers.
That is good practice.
It is rare, not common. Managers and funders pay for features.
Unreliable, insecure software sells very well, so making reliable, secure software is generally a "waste of money".
That still requires funding. Even if your lab happens to have all the equipment required to replicate, you're paying the grad student for their time spent replicating this paper, and you'll need to buy supplies: chemicals, animal subjects, time on shared equipment, etc.
You may well know this, but I get the sense that it isn’t necessarily common knowledge, so I want to spell it out anyway:
In a lot of cases, the salary for a grad student or tech is small potatoes next to the cost of the consumables they use in their work.
For example, I work for a lab that does a lot of sequencing, and if we're busy, one tech can use $10k worth of reagents in a week.
We are in the comment section about an AI conference, and up until the last few years, material/hardware costs for computer science research were very cheap compared to other sciences like medicine and biology, where they use bespoke instruments and materials. In CS, until very recently, all you needed was a good consumer PC for each grad student, and it lasted for many years. Nowadays GPU clusters are needed, but funding is generally not keeping up with that, so even good university labs are way under-resourced on this front.
Enough people will falsify the replication and pocket the money, taking you back to where you were in the first place and poorer for it. The loss of trust is an existential problem for the USA.
There is actually a ton of replication going on at any given moment, usually because we work off of each other's work, whether those others are internal or external. But reporting anything basically destroys your career, the same way saying something about Weinstein before everyone else was doing it would have. So most of us just default to having a mental list of people and circles we avoid as sketchy, deal with it the way women deal with creepy dudes in music scenes, and sometimes pay the troll toll. IMO, this is actually one of the reasons for recent increases in silo-ing, not just stuff being way more complicated lately; if you switch fields, you have to learn all this and pay your troll tolls over again. Anyway, I have discovered or witnessed serious replication problems four times --
(1) An experiment I was setting up, using the same method both on a protein previously analyzed by the lab as a control and on some new ones, yielded consistently "wonky" results in both (read: a different method was needed, since additional interactions were implied that made the standard method inappropriate). I wasn't even in graduate school yet and was assumed to simply be doing shoddy work; after all, the previous work was done by a graduate student who is now faculty at Harvard, so clearly someone better trained and more capable. Well, I finally went through all of his poorly marked lab notebooks and got all of his raw data... his data had the same "wonkiness" as mine; he presumably just wanted to stick to that method and "fixed" it with extreme cherry-picking and selective reporting. Did the PI whose lab I was in publish a retraction or correction? No, it would be too embarrassing to everyone involved, so the bad numbers and data live on.
(2) A model or, let's say, "computational method" was calibrated on a relatively small, incomplete, and partially hypothetical data-set maybe 15 years ago, but, well, that was what people had. There are many other models that do a similar task, by the way, no reason to use this one... except this one was produced by the lab I was in at the time. I was told to feed the results of this one into something I was working on and instead, when reevaluating it on the much larger data-set we have now, found it worked no better than chance. Any correction or mention of this outside the lab? No, and even in the lab, the PI reacted extremely poorly and I was forced to run numerous additional experiments, which all showed the same thing: there was basically no context in which this model was useful. I found a different method that worked better and subsequently had my former advisor "forget" (for the second time) to write and submit his portion of a fellowship he had previously told me to apply to. This model is still being tweaked in still-useless ways and trotted out in front of the national body that funds a "core" grant the PI basically uses as a slush fund, as a sign of the "core's" "computational abilities." One of the many reasons I ended up switching labs. The PI is an NAS member, by the way, who also auto-rejects certain PIs' papers and grants because "he just doesn't like their research" (i.e. they pissed him off in some arbitrary way), flew out a member of the Swedish RAS and helped them get an American appointment seemingly in exchange for winning a sub-Nobel prize for research they basically had nothing to do with, and used to use various lab members as free labor on super random stuff for faculty who approved his grants, so you know the type.
(3) Well, here's a fun one with real stakes. Amyloid-β oligomers, a field already rife with fraud. A lab that supposedly has real ones kept "purifying" them for the lab involved in (2), only for the vial to arrive basically destroyed. This happened multiple times, leading them to blame the lab, then the shipping. Okay, whatever. They send raw material and tell people to follow a protocol carefully to make new ones. Various different people try, including people who are very, very careful with such methods and can make everything else. Nobody can make them. The answer is "well, you guys must suck at making them." Can anyone else get the protocol right? Well, not really... But, admittedly, someone did once get a different but similar protocol to work, only under the influence of a strong magnetic field, so maybe there's something weird going on in their building that they don't actually know about and maybe they're being truthful. But, alternatively, they're coincidentally the only lab in the world that can make the super special sauce, and everybody else is just a shitty scientist. Does anyone really dig around? No, why would a PI doing what the PI in (2) does want to make an unnecessary enemy of someone just as powerful and potentially shitty? Predators don't like fighting.
(4) Another one that someone just couldn't replicate at all; they poured four years into it, and the origin was a big lab. Same vibe as the third case: "you guys must just suck at doing this," then "well, I can't get in contact with the graduate student who wrote the paper, they're now in consulting, and I can't find their data either." No retraction or public comment; too big of a name to complain about except maybe on PubPeer. Wasted an entire R21.
Yeah, but doesn't publishing an easily falsifiable paper end one?
One, it doesn't damage your reputation as much as one would think.
But two, and more importantly, no one is checking.
Tree falls in the forest, no one hears, yadi-yada.
Here's a work from last year that was plagiarized. The rare thing about this work is that it was submitted to ICLR, which makes reviews public for both rejected and accepted works.
You'll notice you can click on author names and get links to their various scholar pages, notably DBLP, which makes it easy to see how frequently authors publish with other specific authors.
Some of those authors have very high citation counts... in the thousands, with 3 having over 5k each (one with over 18k).
https://openreview.net/forum?id=cIKQp84vqN
> no one is checking.
I think this is the big part of it. There is no incentive to do it even when the study can be reproduced.
The vast majority of papers are so insignificant that nobody bothers to try to use, and thereby replicate, them.
But the thing is… nobody is doing the replication to falsify it. And if they did, it wouldn't be published because it's a null result.
Not in most fields, unless misconduct is evident. (And what constitutes "misconduct" is cultural: if you have enough influence in a community, you can exert that influence on exactly where that definitional border lies.) Being wrong is not, and should not be, a career-ending move.
If we are aiming for quality, then being wrong absolutely should be. I would argue that is how it works in real life anyway. What we quibble over is what is the appropriate cutoff.
4 replies →
Not really, since nobody (for values of) ends up actually falsifying it, and if they do, it's years down the line.
Grad students have this weird habit of eating food and renting places to live, though, so that's also money.