Comment by jklinger410

1 day ago

> the content of the papers themselves are not necessarily invalidated. For example, authors may have given an LLM a partial description of a citation and asked the LLM to produce bibtex (a formatted reference)

Maybe I'm overreacting, but this feels like an insanely biased response. They found the one potentially innocuous reason and latched onto that as a way to hand-wave the entire problem away.

Science already had a reproducibility problem, and it now has a hallucination problem. Considering the massive influence the private sector has on both the work and the institutions themselves, the future of open science is looking bleak.

I found at least one example[0] of authors claiming the reason for the hallucination was exactly this. That said, I do think for this kind of use, authors should go to the effort of verifying the correctness of the output; even a crude automated check (see the sketch after the link below) would catch fully fabricated titles. I also tend to agree with others who have commented that while a hallucinated citation or two may not be particularly egregious, it does raise concerns about what other errors may have been missed.

[0] https://openreview.net/forum?id=IiEtQPGVyV&noteId=W66rrM5XPk
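To make "verifying the correctness of the output" concrete, here's a minimal sketch of an author-side check against the public Crossref API. The endpoint and response shape are real Crossref; the matching logic is deliberately crude and the sample title is just an illustration, not an endorsement of any particular tool.

    import requests

    def title_exists(title: str) -> bool:
        # Ask Crossref's public works endpoint whether the cited title is known
        resp = requests.get(
            "https://api.crossref.org/works",
            params={"query.bibliographic": title, "rows": 1},
            timeout=10,
        )
        resp.raise_for_status()
        items = resp.json()["message"]["items"]
        if not items:
            return False
        # Crude containment check; real tooling should fuzzy-match the title
        # and compare authors/venue/year before trusting LLM-generated BibTeX
        found = " ".join(items[0].get("title", [])).lower()
        return title.lower() in found

    print(title_exists("Deep Residual Learning for Image Recognition"))  # True

A check like this catches fully invented papers; it won't catch a real paper cited for a claim it doesn't make, which is the harder problem.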

The wording is not hand-wavy. They said "not necessarily invalidated", which can mean exactly that innocuous scenario and nothing more.

  • I really think it is. The primary function of these publications is to validate science. When we find invalid citations, it shows they're not doing their job. When they get called on that, they cite the volume of work their publication puts out and point to the one explanation that wouldn't be disqualifying.

    Seems like CYA, seems like hand wave. Seems like excuses.

  • Even if some of those mistakes really are innocent, we'll all be better off treating the people who make them as acceptable casualties in an unforgiving campaign against academic fraudsters.

    It's like arguing against strict liability for drunk driving because maybe somebody accidentally let their grape juice sit too long and didn't know it had fermented... I can conceive of such a thing, but that doesn't mean we should go easy on drunk driving.

I don’t read the NeurIPS statement as malicious per se, but I do think it’s incomplete.

They’re right that a citation error doesn’t automatically invalidate the technical content of a paper, and that there are relatively benign ways these mistakes get introduced. But focusing on intent or severity sidesteps the fact that citations, claims, and provenance are still treated as narrative artifacts rather than things we systematically verify.

Once that’s the case, the question isn’t whether any single paper is “invalid” but whether the workflow itself is robust under current incentives and tooling.

A student group at Duke has been trying to think about this with Liberata: what publishing looks like if verification, attribution, and reproducibility are first class rather than best effort.

They have a short explainer here that lays out the idea, for anyone who wants more context: https://liberata.info/

Isn't disqualifying X months of potentially great research over a malformed but real reference harsh? I don't think they'd be okay with references that are actually made up.

  • When your entire job is confirming that science is valid, I expect a little more humility when it turns out you've missed a critical aspect.

    How did these 100 sources even get through the validation process?

    > Isn't disqualifying X months of potentially great research due to a misformed, but existing reference harsh?

    It will serve as a reminder not to cut any corners.

    • > When your entire job is confirming that science is valid, I expect a little more humility when it turns out you've missed a critical aspect.

      I wouldn't call a malformed reference a critical issue; it happens. That's why we have peer review. I would contend that drawing superficially valid conclusions from studies through use of AI is a much more pressing problem, and one that speaks more to the integrity of the author.

      > It will serve as a reminder not to cut any corners.

      Or yet another reason to ditch academic work for industry. I doubt the rise of scientific AI tools like AlphaXiv [1], whether you consider them beneficial or detrimental, can be avoided - which calls for a level of pragmatism.


  • Science relies on trust, a lot. So things which show dishonesty are penalised greatly. If we were to remove trust, then peer reviewing a paper might take months of work or even years.

    • And that timeline only grows with the complexity of the field in question. I think this is inherently a function of the complexity of the study, and rather than harshly penalizing such shortcomings we should develop tools that address them and improve productivity. AI can speed up the verification of requirements like proper citations, both on the author's and reviewer's side (a rough sketch of the reviewer-side idea follows at the end of the thread).

    • Math does that. Peer review cycles are measured in years there. This does not stop fashionable subfields from publishing sloppy papers, and occasionally even irrecoverably false ones.
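(On the tooling point above: a minimal sketch of the reviewer-side half, assuming the references carry DOIs. doi.org is the real resolver; the sample DOIs are only illustrative.)

    import requests

    def doi_resolves(doi: str) -> bool:
        # doi.org answers a registered DOI with a redirect (3xx) to the
        # publisher, and an unknown one with 404; no need to follow it
        r = requests.head(f"https://doi.org/{doi}",
                          allow_redirects=False, timeout=10)
        return 300 <= r.status_code < 400

    for doi in ["10.1109/CVPR.2016.90", "10.9999/not-a-real-doi"]:
        print(doi, "resolves" if doi_resolves(doi) else "NOT FOUND")

Batch-running something like this over a submission's bibliography is cheap. It doesn't validate that a citation actually supports the claim, but it would have flagged fully hallucinated entries.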