Comment by ctoth
1 day ago
The innumeracy is load-bearing for the entire media ecosystem. If readers could do basic proportional reasoning, half of health journalism and most tech panic coverage would collapse overnight.
GPTZero of course knows this. "100 hallucinations across 53 papers at prestigious conference" hits different than "0.07% of citations had issues, compared to unknown baseline, in papers whose actual findings remain valid."
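For scale, a quick back-of-envelope (nothing here beyond the two numbers already quoted; the denominator is just what 100 hallucinations at 0.07% implies, not a figure from the article):

    # rough sketch: the citation pool implied by the two quoted numbers
    hallucinated = 100         # flagged citations, as reported
    rate = 0.0007              # 0.07%, the quoted fraction of citations with issues
    implied_total = hallucinated / rate
    print(f"implied citation pool: ~{implied_total:,.0f}")  # ~142,857 citations

That implied denominator is exactly the kind of arithmetic the headline framing skips.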
I’m not sure that’s fair in this context.
In the past, a single paper with questionable or falsified results at a top-tier conference was big news.
Something that casts doubt on the validity of 53 papers at a top AI conference is at least notable.
> whose actual findings remain valid
Remain valid according to whom? The same group that missed hundreds of hallucinated citations?
Which of these papers had falsified results, as opposed to just bad citations?
What is the base rate of bad citations pre-AI?
And finally, yes: peer review does not mean clicking every link in the footnotes to make sure the original paper didn't mislink, though I'm sure after this brouhaha that too will be automated.
> Peer review does not mean clicking every link in the footnotes
It wasn't just broken links; there were also authors cited as "lastname, firstname" and made-up titles.
I have done peer reviews for a (non-AI) CS conference and did at least skim the citations. For papers related to my domain, I was familiar with most of the citations already, and looked into any that looked odd.
Being familiar with the state of the art is, in theory, what qualifies you to do peer reviews.
> "0.07% of citations had issues
Nope, you are getting this part wrong. On purpose or by accident? Because it's pretty clear, if you read the article, that they are not counting every citation that simply had issues. See "Defining Hallucinated Citations".