
Comment by Aurornis

1 day ago

> Even if 1.1% of the papers have one or more incorrect references due to the use of LLMs, the content of the papers themselves are not necessarily invalidated.

This statement isn’t wrong, as the rest of the paper could still be correct.

However, when I see a blatant falsification somewhere in a paper, I'm immediately suspicious of everything else. Authors who take lazy shortcuts when convenient usually don't do it just once; they do it wherever they think they can get away with it. It's a slippery slope from letting an LLM handle citations, to letting the LLM write things for you, to letting the LLM interpret the data. The latter opens the door to hallucinated results and statistics, as anyone who has experimented with LLMs for data analysis will eventually discover.

Yep, it's a slippery slope. No one in their right mind would have tried to use GPT-2 to write part of their paper. But the hallucination error rate kept decreasing. What do you think: is there an acceptable hallucination error rate greater than zero?