Had a colleague submit a paper with literal AI slop left in the text, got hit with a nasty revision request. Check your drafts before you submit, people. The reviewers will find it.
Also check your LaTeX comments: arXiv makes those publicly visible!
I'm a screen reader user and usually read papers as raw TeX. I've seen everything: slurs, demeaning comments towards reviewers and professors, admissions of fraud, instructions to coauthors to commit further fraud before paper submission to mask the earlier fraud... it's all there. There's far less of it than I would think, definitely <1% of papers, but it's there.
I think it would be useful to run an LLM anti-fraud pass on the TeX source of all new arxiv papers. It wouldn't catch everything, but it would catch some of the dumbest fraudsters.
On the positive side, you can also find stronger claims that didn't survive review, additional explanations that didn't make the cut due to the conference's page limit, as well as experimental results that the authors felt weren't really worth including. Those need to be approached with an abundance of caution, but are genuinely useful sometimes.
It's been pretty eye-opening watching Craig Wright (of bitcoin fakery fame) flooding out LLM-generated 'academic' papers and even having some of them accepted.
He's toast if SSRN were to adopt a similar policy.
It seems a good idea to ban cheating, but how hard is it, especially in new reasoning/agent contexts, to validate references?
The deeper question is whether legitimate AI generated results are allowed or not?
Test - In the extreme - think proof of Riemann Hypothesis autonomously generated (end to end) formally proven - is it allowed or not?
In that case, you would simply not include a reference. End-to-end autonomous science might have fewer concrete citations, as the contributing knowledge is just the sum of the model's training data.
There already exists multiple tools for automatically verifying references. This measure will likely only filter out the laziest and most incompetent of AI slop submissions. It's a very modest raising of the bar, but comes at zero cost to honest researchers.
I expect arXiv will still have problems with slop submissions but, at least, their references should actually exist going forward.
It isn't "cheating" they're concerned with, it's sloppiness. This dictum isn't some sort of AI ban, but instead simply that if there is evidence that it was so low effort that the work includes such blatant problems, it's just adding noise.
> think proof of Riemann Hypothesis autonomously generated (end to end) formally proven - is it allowed or not?
Sorry to be rude, but this seems like a dumb question. I want science to progress. A primary purpose of these journals is to progress science. A full proof of the Riemann Hypothesis progresses science. I don't care how it was produced, if Hitler is coauthor, etc, I just care that it is correct. Whether the authors should be rewarded for whatever methods they used can be a separate question.
Terence Tao had a nice talk from the Future of Mathematics conference posted yesterday [0] that shapes a lot of my own feelings on this matter.
The short of it is that he argues being first to correctness shouldn't be the only goal and isn't a great optimisation incentive. Presentation and digestibility of correct results are the missing third once you've finished generation and verification. I completely agree with him. You don't just need an AI-generated proof of the Riemann Hypothesis; you would really like it to be intentional and structured so that others can understand it.
A really beautiful quote I learned of in the talk is this:
> "We are not trying to meet some abstract production quota of definitions, theorems, and proofs. The measure of our success is whether what we do enables people to understand and think more clearly and effectively about math." - William Thurston
> The penalty is a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue.
This is incredibly good for science. arXiv is free, but it's a privilege not a right!
I'm not seeing this clearly listed on https://info.arxiv.org/help/policies/index.html so it's possible this is planned but not live yet - or perhaps I'm not digging deeply enough?
As a certain doctor once said: the whole point of the doomsday machine is lost if you keep it a secret!
I bet, since this has been posted, someone here has already vibe coded a reference checker that they plan to put behind a subscription.
This is good for reference checking, but I doubt this will do much for the most likely shoddy science that accompanies hallucinated references.
The frontier LLMs are getting pretty good at checking this sort of thing. You could prompt them to not only verify the references are real but that they actually state what the article claims. Some human review will still be needed but I'll bet this approach could find a lot of academic fraud.
My take: this seems excessive.
ArXiv doesn't even check the submission closely, so how can they know?
They say "errors, mistakes"
They use an automated system to check if the basic requirements were met, and sometimes papers are flagged for further superficial human review, but there is no way they can possibly do this at scale or check every reference. This would be like trying to do peer review, but for a preprint archive that gets easily 100x more volume than any journal.
Second, there is a huge gap between publishing on arXiv and peer review. I can attest personally that it's not even close: I've gotten probably a dozen rejections from peer review and no problems publishing in arXiv math. This is because peer review checks not just whether something is new or correct, but also whether it is of "interest to the math community," which is inherently subjective, and that makes peer review orders of magnitude harder than publishing on arXiv.
Even though a well-known professor in number theory praised the paper when I got an endorsement, and a second professor emailed me and encouraged me to publish it, it still got rejected 3 times and is still waiting.
Being required to publish in a peer-reviewed journal will close off arXiv for many researchers for good. It also defeats the point of it being a preprint.
This puts the burden to make sure it's right on the submitter, where it should be. Verification can come at any time after that; the submitter understands the consequences of hallucinated references. Verification can be crowd-sourced (and likely will be).
Nothing stops someone from putting a PDF on the internet. I'm fine with ArXiv holding a high standard.
> ArXiv doesn't even check the submission closely, so how can they know?
They can be informed by people who read the papers and check the citations. A zero-tolerance policy provides an incentive to report sloppy papers (namely, that you can be confident something will be done about it), and each time a paper is removed or an author is banned, it incrementally increases the value of the arXiv as a whole.
> Being required to publish in a peer-reviewed journal will close off arXiv for many researchers for good.
At the end of the day, demanding that people carefully proofread their LLM-generated papers before sharing them on the arXiv seems like a relatively low bar to clear, and I sort of question whether it's reasonable to call individuals who find it too onerous "researchers" in the first place.
You could at least filter out hallucinated references which simply don't exist pretty trivially, I'd imagine.
You don't need to be actively enforcing a rule 100% on everyone. Speed cameras don't cover every stretch of road either.
It's enough for them to place this policy and enforce it when they become aware of violations. Someone reading the slopped paper (or, here, trying to follow a reference) will notice sooner or later.
> Being required to publish in a peer-reviewed journal will close off arXiv for many researchers for good. It also defeats the point of it being a preprint.
You make it sound as if it's impossible for researchers to write papers without slopped references, and as if getting hit by this policy is inevitable.
> This is incredibly good for science.
I disagree. It's just one darn hallucinated citation for heaven's sake, not fraud or something. It doesn't account for the substance or quality of their work at all. A one-year ban seems plenty sufficient for a minor first time mistake like this. People make mistakes and a good fraction of them can learn from those mistakes. There's no need to permanently cripple someone's ability to progress their life or contribute to humanity just because an AI hallucinated a reference one time in their life. That's punitive instead of rehabilitative.
> It's just one darn hallucinated citation for heaven's sake, not fraud or something.
It is fraud.
> It doesn't account for the substance or quality of their work at all.
References are part of the work. If you're making up the references, what else are you making up?
> People make mistakes and a good fraction of them can learn from those mistakes. There's no need to permanently cripple someone's ability to progress their life or contribute to humanity just because an AI hallucinated a reference one time in their life.
A one year ban is not permanent. Having a negative consequence for making poor decisions seems like an inducement to learn from the mistake?
In an ideal world, one would keep notes on the references used while doing the research that led to the paper. Choosing not to do that is one poor decision.
To take a more positive view: if one asks an AI to suggest references that may have been missed, one should at least verify that the references exist and are relevant. Choosing not to do that is also a poor decision, even if one did take notes on references while researching.
A "mistake" would be a typo in a real citation. A hallucinated citation is evidence of just plain laziness and negligence, which taints the entire submission.
If you cannot be bothered to check your references when writing academic quality papers then you have no place writing them in the first place. The punishment is not chopping off a finger, it is a polite reminder to do the bare minimum.
What's the difference between a "hallucinated" citation and consciously inserting a reference to a non-existent paper and hoping it goes unnoticed? How do we determine which one was done consciously and which was "a minor first time mistake"?
Your standards are lower than what they would accept at my high-school. Seriously.
And generally, if you are generating papers with LLMs, let other LLMs read them. Why would we waste human hours considering something that was generated? At that point, publish your prompt, because that's the actual work you did.
It's not the kind of mistake that is possible unless you're engaging in fraud anyway.
A citation is where you derived knowledge. If you haven't checked it, and you are submitting something that should represent a ton of labour (and which will consume labour to review), you don't understand what you're doing. It is not just dotting i's and crossing t's.
Your being set behind is less important than the fact that your publishing is setting everyone else behind.
Such a banned person is being helped to "step out of the way", and someone more competent will assuredly step forward to consume the limited maintenance labour more thoughtfully.
It’s easy to avoid this whole issue: write the paper yourself.
Yes, it is fraud
Don't use AI? Problem solved?
> There's no need to permanently cripple someone's ability to progress their life or contribute to humanity
I don't think you need to publish on arXiv to contribute meaningfully to humanity.
> That's punitive instead of rehabilitative.
Unfortunately science is competitive. Yours is a race to the bottom where the people who can afford the most expensive models and who are least concerned with the truth can publish the most papers and benefit financially and professionally by doing so. This is a zero-sum arena: grant money and opportunities may well be awarded to them, and not to another team producing more careful and genuine output.
You are being ironic, right?
In science, one hallucinated reference can corrupt the entire rest of the work. So you're completely wrong.
Seeing the usual LLM hypers angrily replying to this on Twitter is such a tell. Just like the comments on the LLM poisoning articles: some people just can't accept that some people don't like LLMs, and get upset when you put any amount of hindrance in the way of their rapid acceptance.
It's hard for me to even understand their perspective. Researching references for a published academic paper isn't some incidental busywork task, it's supposed to be a core part of doing research which is the core of the job. If you don't have sympathy for someone who, say, paid a person on Fiverr to cook up a paper rather than writing it themselves and then didn't even bother to check the references, why is using an LLM and not checking any better?
There is a lot of "throw it against the wall, and if it sticks, write it up" empirical work against benchmarks. It leads to post-hoc rationalization of the work and browser plugins using LLMs to find references for work that is already written. It is a bureaucratic view about "you need a citation for this", where people misunderstand the citation as a checkbox, instead of "you need to substantiate this claim, as I, the reviewer, do not accept this as a fact".
It's also hilarious that they complain about this because, from what I've seen, most LLM hypers will talk about something being irrelevant or taken over by AI with no understanding of what that something really is or involves.
> some people don't like LLMs
It's not even that they "don't like LLMs". They just don't like academic fraud! If references were fabricated with a Markov chain it would be just as bad!
Crazy that this is graytexted. So basically the HN consensus is that we need to hype LLMs and accelerate their adoption everywhere.
Bonkers. At the same time, peak HN.
https://xcancel.com/tdietterich/status/2055000956144935055
> Our Code of Conduct states that by signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the contents were generated (Dietterich, T. G.)
coauthors about to get roasted
While this is certainly a welcome step, I hope more work is done to fix the underlying problem: there is no easy way to create correct BibTeX entries for cited papers. Citations for any given paper can come from a wide range of journals with various publishers, conferences, and preprints, and the same paper can be available from multiple sources with varying details, e.g. arXiv and the conference website.

Tools like Zotero have certainly made it significantly easier to extract citations from publication webpages, but I still find issues with the extracted BibTeX details. While author names and titles are often extracted correctly, I still have to manually ensure that details like publication venue, year, volume number, page number, URL, etc. are extracted correctly and rendered correctly in LaTeX. Different publications can use different citation styles. The lack of an easy, unified approach to extracting consistent citation data can unfortunately lead to taking shortcuts with AI-generated citation data.

I am not sure whether hallucinated citations are being generated in the main manuscript or in a separate BibTeX file, so I may be a bit off in my understanding.
Fun fact: if an article has a DOI, you can just use curl to get a BibTeX entry. An example using one of my articles:
This is the exact same method that Zotero uses internally, so this won't ever give you better results, but I still find it kinda neat.
Note that Zotero also has a free online tool to generate citations in any format or BibTeX files from a URL/DOI/ISBN/...
https://zbib.org/
Good.
If it’s not worth your time to check the output of your LLM carefully, it’s not worth my time to read it.
Unfortunately, it's probably not worth your time to read 99% of arxiv papers, LLM generated or otherwise.
Ever pick a random one and really dive in?
Well, yeah, 99% of arXiv papers were not written for me or you. They were written for someone who works in a niche within a niche. That's (in my view) the beauty of research.
Agreed. There was already too much human generated slop in academia.
And I’m not talking about good faith research that didn’t pan out, I mean research that is completely useless for any other purpose other than convincing a casual observer that the authors are doing research.
Next, for AI papers, a reproducibility requirement. So much code and so many details are fudged, and papers cannot be reproduced: the training was run with some other config, or other data, etc., to make the mechanism or intervention seem better.
I just wish everyone who is against this policy could be forced to review a paper that turns out to be unedited AI slop. Reviewers are expert volunteers who do it for free. It is incredibly frustrating to spend 4 hours reading a paper, trying your best to make sense of what the authors are trying to prove, only to realize that it is hallucinations.
The authors should value the reviewers' time more highly than their own. So if you include AI nonsense in your paper, it is insulting.
Great. It's so easy to automate reference checking that not checking is inexcusable.
How will they detect hallucinated refs at scale? Manual spot checks? Automated DOI verification? The policy seems right, but enforcement is the hard part.
Enforcement is secondary; it can take weeks or months, or never happen at all if nobody reads the paper. It's about being able to ban when an issue arises, not about keeping the database strictly clean.
However difficult it might be right now, it's only going to get easier. Anyway, I don't think proactive enforcement is the point. Rather, they now have an official method for addressing incidents that are brought to their attention.
There needs to be careful vetting before such adverse actions. If somebody includes a coauthor's name and submits without their express permission, does everyone get the ban? I agree that, implemented the right way, this is good.
Plus, afaik you can add any co-author you want without validation. So one paper with one bad sentence could get anyone on arXiv banned.
As I mentioned in another thread:
To be a coauthor on a preprint that you have not submitted, you have to actively "claim" it (using a password given to the author who submitted). It's on you to double-check before claiming.
I surely hope that only "confirmed" coauthors will get the ban, it's only logical.
No mercy to brain slugs.
This has become such a problem in scholarly publishing that we've spent the last couple of years building a business that provides citation checking: https://groundedai.company/
What’s the hallucination rate of your AI?
It's not unexpected, but still sad to see so many comments opposing even the smallest step against low-effort fraud in academic publications. Is this what hacker culture has been reduced to in the slop era? Open hostility toward science and engineering?
Good; academic literature is in crisis because of all the slop. Forcing some consequences for easily detectable hallucinations can only be a good thing.
It's not just AI, though. I did a doctorate in physics about 40 years back, and bad references were a problem back then.
Doesn't matter if it is AI hallucinations or entirely human scientific fraud, the problem is the same, and the solution works fine for both cases.
If you can't validate that your bibliography is full of real articles, you shouldn't get published.
LLMs have just poured gasoline on the fire.
In what way? Surely something like the source not quite saying what was cited, or mixing up citations, rather than inventing them outright?
"Bad", like, you literally just made them up? I hope that would have been a problem.
Which is why the angry replies on Twitter from AI hype accounts are so funny. You should get penalised for fake references and profanity in your submissions even if you wrote your slop longhand. I don't know why anyone would have an issue with this policy.
Yes and ffs arrows kill people too but we don't bring that up every time we talk about what to do with guns.
Imagine how bad they are now then.
What are reasonable alternatives to arXiv? It has become increasingly slow. TechRxiv?
Had a colleague submit a paper with literal AI slop left in the text, got hit with a nasty revision request. Check your drafts before you submit, people. The reviewers will find it.
Also check your LaTeX comments; arXiv makes those publicly visible!!!
I'm a screen reader user and usually read papers as raw TeX. I've seen everything: slurs, demeaning comments towards reviewers and professors, admissions of fraud, instructions to coauthors to commit further fraud before paper submission to mask the earlier fraud... it's all there. There's far less of it than I would think, definitely <1% of papers, but it's there.
I think it would be useful to run an LLM anti-fraud pass on the TeX source of all new arxiv papers. It wouldn't catch everything, but it would catch some of the dumbest fraudsters.
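Even before an LLM pass, just surfacing the comments is a start. A minimal sketch (my own illustration, not any tool arXiv uses) that pulls comment text out of TeX source, ignoring escaped `\%` but not handling verbatim environments:

```python
import re

def extract_comments(tex_source: str) -> list[str]:
    """Return the text of every LaTeX comment in the source.
    A '%' starts a comment unless it is escaped as '\\%'.
    Caveat: does not special-case verbatim environments."""
    comments = []
    for line in tex_source.splitlines():
        # Drop escaped percent signs so they aren't mistaken for comments.
        scrubbed = line.replace(r"\%", "")
        if "%" in scrubbed:
            comments.append(scrubbed.split("%", 1)[1].strip())
    return comments

# Illustrative source with the kind of comments that should never ship.
tex = r"""
\section{Results}
We achieve 95\% accuracy.  % TODO: rerun with the real config
% don't tell reviewer 2 about the dropped seeds
"""
print(extract_comments(tex))
```

Feeding the extracted comments (rather than the whole paper) to a reviewer or a model keeps the pass cheap and focused.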
On the positive side, you can also find stronger claims that didn't survive review, additional explanations that didn't make the cut due to the conference's page limit, as well as experimental results that the authors felt weren't really worth including. Those need to be approached with an abundance of caution, but are genuinely useful sometimes.
https://xcancel.com/leaksph
That's why my forarxiv make target includes a run through latexpand
Sad the suggestion here is to just disguise the slop to make it harder for reviewers to spot rather than not submitting slop to begin with.
Hurray!
No comments here yet seem to address the "reputable" condition. What criteria make a review venue reputable?
It's been pretty eye opening watching Craig Wright (of bitcoin fakery fame) flooding out LLM generated 'academic' papers and even having some of them accepted.
He's toast if SSRN were to adopt a similar policy.
Should be more harsh in my opinion.
It seems a good idea to ban cheating, but how hard is it, especially in new reasoning/agent contexts, to validate references?
The deeper question is whether legitimate AI-generated results are allowed or not. As an extreme test case: a proof of the Riemann Hypothesis, autonomously generated end to end and formally verified. Is it allowed or not?
This is not about banning cheating, it’s about banning inaccurate information.
You don’t need to solve everything; catching a few thousand nonexistent citations with such a policy is a net benefit on its own.
It is allowed as long as it’s verified.
The thread specifically points out that if authors can’t be arsed to simply proofread their text, the rest cannot be trusted either.
It’s a simple heuristic against low quality submissions, not an anti-ai measure.
If you use AI correctly, nobody should be able to tell that it was used at all.
In that case, you would simply not include a reference. End-to-end autonomous science might have fewer concrete citations, as the contributing knowledge is just the sum of the model's training data.
There already exist multiple tools for automatically verifying references. This measure will likely only filter out the laziest and most incompetent AI slop submissions. It's a very modest raising of the bar, but it comes at zero cost to honest researchers.
I expect arXiv will still have problems with slop submissions but, at least, their references should actually exist going forward.
It isn't "cheating" they're concerned with; it's sloppiness. This dictum isn't some sort of AI ban. It simply says that if a work is so low-effort that it includes such blatant problems, it's just adding noise.
> think proof of Riemann Hypothesis autonomously generated (end to end) formally proven - is it allowed or not?
Sorry to be rude, but this seems like a dumb question. I want science to progress. A primary purpose of these journals is to progress science. A full proof of the Riemann Hypothesis progresses science. I don't care how it was produced, if Hitler is coauthor, etc, I just care that it is correct. Whether the authors should be rewarded for whatever methods they used can be a separate question.
Terence Tao had a nice talk from the Future of Mathematics conference posted yesterday [0] that shapes a lot of my own feelings on this matter.
The short of it is he argues that being first to a correct result shouldn't be the only goal and isn't a great optimisation incentive. Presentation and digestibility of correct results is the missing third once generation and verification are done. I completely agree with him. You don't just need an AI-generated proof of the Riemann Hypothesis. You would really like it to be intentional and structured for others to understand.
A really beautiful quote I learned of in the talk is this:
> "We are not trying to meet some abstract production quota of definitions, theorems, and proofs. The measure of our success is whether what we do enables people to understand and think more clearly and effectively about math." - William Thurston
[0] https://www.youtube.com/watch?v=Uc2zt198U_U
2 replies →