Comment by godelski
14 hours ago
> In an ideal world, one would be keeping notes on references used
In a far less than ideal world, authors are referencing papers they've at least read the title and abstract of. In an ideal world, authors would only reference works they have read in their entirety. I don't think we need to live in the ideal world[0], but let's also not pretend the ideal world is even remotely out of reach. Let's also be honest that, in the current setting, a lot of citations are used to help a work get accepted rather than for their utility to the paper. The average ML paper is now 8 pages with >50 citations. That's crazy
[0] References can be entire textbooks, which is potentially too high of a bar
Even as a human, you can still fuck up references.
I submitted a paper with a reference author listed as Elisio because I couldn't read my own handwriting. After submitting, I double-checked all the references through an LLM. It pointed out that the name was actually Enrique. Yes, you should probably double-check your references before submitting, not after.
Point is, I didn’t even trust the LLM at first. But after verifying the mistake, I was embarrassed af. I resubmitted with the fixes before it went live, but ultimately, what’s the difference between “mistake” and “hallucination”?
Sounds like you could use a tool like Zotero.
With proper bibliography management tools, everything (that has one) is centered around the DOI.
In fact, if a DOI is present, it's trivial to verify authors, title, venue, year, pages etc.
Of course, some older and more obscure papers won't have a DOI, but the vast majority of research work does.
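To make the "trivial to verify" claim concrete: a minimal sketch of looking up a DOI's metadata via the public Crossref REST API (`https://api.crossref.org/works/{doi}`). The example DOI and the exact fields you'd compare are illustrative; real bibliography managers like Zotero do this lookup for you.

```python
# Hedged sketch: verify a citation's authors/title/year from its DOI
# using the public Crossref REST API. Assumes network access; field
# names ("title", "author", "DOI") follow Crossref's message schema.
import json
import urllib.parse
import urllib.request


def normalize_doi(doi: str) -> str:
    """Strip common URL/prefix forms so only the bare '10.xxxx/...' remains."""
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            return doi[len(prefix):]
    return doi


def fetch_metadata(doi: str) -> dict:
    """Look up a DOI on Crossref and return the work's metadata record."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(normalize_doi(doi))
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["message"]


if __name__ == "__main__":
    # Illustrative DOI; compare these fields against what your .bib entry claims.
    meta = fetch_metadata("https://doi.org/10.1038/nature14539")
    print(meta.get("title"))
    print([a.get("family") for a in meta.get("author", [])])
```

A typo like "Elisio" vs. "Enrique" would show up immediately in the `author` list returned for the DOI.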
Given their examples, and examples I've seen Thomas talk about in the past, I doubt a typo like that would be grounds for the ban.
Perhaps the issue is that people aren't logged in or using xcancel, so they're missing part of the tweet thread. Here's an important line:
> If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper.
Followed by:
> Examples of incontrovertible evidence: hallucinated references, meta-comments from the LLM ("here is a 200 word summary; would you like me to make any changes?"; "the data in this table is illustrative, fill it in with the real numbers from your experiments")
I wouldn't look at your case and read that as "incontrovertible evidence". They are looking for the absolutely brain-dead, no-one-at-the-wheel type of errors: things like your paper saying "As an AI language model". There will be real papers with that exact phrase, but it should get flagged, not auto-banned.
I assume they won't ban anyone automatically without a way to object. Using your example, I wouldn't assume they would enforce the ban if you object, explain your typo, and the corrected citation actually says what you cited. Mistakes like these are explainable; a completely hallucinated citation usually is not.