Comment by raincole

23 days ago

I know many people have negative opinions about this.

I'd also like to share what I've seen. Since GPT-4o became a thing, everyone I know who submits academic papers in my non-English-speaking country (N > 5) has been writing papers in our native language and translating them exclusively with GPT-4o. It has been the norm for quite a while. If hallucination were such a serious problem, it would have been one for a year and a half already.

Translation is something large language models are inherently good at; that much is uncontroversial, even though the output should still be independently verified. It's a language task, and they are language models.

  • Are they good at translating scientific jargon specific to a niche within a field? I have no doubt LLMs are excellent at translating well-trodden patterns; I'm a bit suspicious otherwise.

    • In my experience using it to translate ML work between English->Spanish|Galician, it translates jargon too eagerly, to the point that I have to tell it to keep specific terms in English to avoid the result sounding weird (for most modern ML jargon there really isn't a Spanish translation). A sketch of that glossary trick is below, after this list.

    • It seems to me that jargon tends to be defined in one language and minimally adapted in others, so I'm not sure that would be much of a concern.

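For what it's worth, pinning jargon can be done with a glossary in the system prompt. A minimal sketch, assuming the OpenAI Python SDK; the glossary entries, prompt wording, and model name are all illustrative:

```python
# Sketch: translate text while keeping a glossary of ML jargon in English.
# Assumes the OpenAI Python SDK (openai>=1.0); everything here is illustrative.
from openai import OpenAI

client = OpenAI()

# Hypothetical glossary of terms to leave untranslated.
KEEP_IN_ENGLISH = ["transformer", "attention head", "fine-tuning", "embedding"]

def translate(text: str, target_language: str = "Spanish") -> str:
    glossary = ", ".join(KEEP_IN_ENGLISH)
    system = (
        f"Translate the user's text into {target_language}. "
        f"Keep these technical terms in English, untranslated: {glossary}."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content
```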

I've heard that rejection rates are going up significantly now that AI conferences are starting to check for hallucinated references. See also the NeurIPS hallucinated-references kerfuffle [1]; a sketch of what such a check might look like follows the link.

[1]: https://statmodeling.stat.columbia.edu/2026/01/26/machine-le...
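To make that concrete: a check like this can be as simple as querying a bibliographic database for each citation and flagging the ones nothing matches. A minimal sketch against the public Crossref API; the fuzzy-match heuristic and threshold are assumptions, not what any conference actually runs:

```python
# Sketch: flag references that no bibliographic database seems to know about.
# Uses the public Crossref works endpoint; the 0.8 threshold is an assumption.
import requests
from difflib import SequenceMatcher

def reference_exists(title: str, threshold: float = 0.8) -> bool:
    """Return True if Crossref's best match for `title` is close enough."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return False
    best_title = (items[0].get("title") or [""])[0]
    similarity = SequenceMatcher(None, title.lower(), best_title.lower()).ratio()
    return similarity >= threshold

# A low-similarity miss doesn't prove hallucination (books, preprints, and
# non-DOI venues are patchy in Crossref), so this is a screen, not a verdict.
```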

  • Honestly, hallucinated references should simply get the submitter banned from ever submitting again. Anyone who submits papers or anything else with hallucinated references shall be publicly shamed. The problem isn't only the LLMs hallucinating; it's lazy and immoral humans who don't bother to check the output, wasting everyone's time and corroding public trust in science and research.

    • I fully agree. Not reading your own references should be grounds for banning, but that's impossible to check. Hallucinated references cannot have been read, so by definition they should get people banned.


  • Yeah, that's not going to work for long. You can draw a line at 2023 and say "every paper before this isn't AI". But in the future you're going to have AI-generated papers citing other AI slop papers that slipped through the cracks: because of the cost of doing research versus the cost of generating AI slop, the slop papers will start to outcompete the real research papers.

    • How is this different from flat-earth / creationist papers citing other flat-earth / creationist papers?

      >the cost of doing research vs the cost of generating

      >slop papers will start to outcompete the real research papers.

      This started to rear its ugly head when electric typewriters got more affordable.

      Sometimes all it takes is faster horses and you're off to the races :\

Translation is quite a safe use case if you maintain provenance, because there is a ground truth to compare against: the untranslated paper.
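Concretely, maintaining provenance can be as simple as storing every translated passage next to its source, so any suspect sentence can be diffed against the original. A minimal sketch; the record layout is illustrative, not any standard:

```python
# Sketch: keep each translated paragraph paired with its untranslated source.
# The record layout here is illustrative, not a standard.
import json
from dataclasses import dataclass, asdict

@dataclass
class TranslatedParagraph:
    source: str       # ground truth: the untranslated original
    translation: str  # the LLM output
    model: str        # which model produced the translation

def save_provenance(pairs: list[TranslatedParagraph], path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump([asdict(p) for p in pairs], f, ensure_ascii=False, indent=2)
```

That way a reviewer, or the author, can always check a claim in the translation against the original wording.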