Comment by random3

17 hours ago

Banning cheating seems like a good idea, but how hard is it to validate references, especially in new reasoning/agent contexts?

The deeper question is whether legitimate AI-generated results are allowed at all. As an extreme test case: imagine a proof of the Riemann Hypothesis, autonomously generated end to end and formally verified. Is it allowed or not?

You don’t need to solve everything; catching a few thousand non-existent citations with such a policy is a net benefit on its own.

It is allowed as long as it’s verified.

The thread specifically points out that if authors can’t be arsed to simply proofread their text, the rest cannot be trusted either.

It’s a simple heuristic against low quality submissions, not an anti-ai measure.

In that case, you would just not include a reference. End-to-end autonomous science might have fewer concrete citations, as the contributing knowledge is just the sum of the model's training data.

There already exist multiple tools for automatically verifying references. This measure will likely filter out only the laziest and most incompetent AI slop submissions. It's a very modest raising of the bar, but it comes at zero cost to honest researchers.

I expect arXiv will still have problems with slop submissions but, at least, their references should actually exist going forward.
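To illustrate the kind of check such tools perform, here is a toy sketch (my own illustration, not any specific tool): flag reference entries that contain no DOI-shaped identifier at all. The regex is a simplified version of a commonly cited DOI pattern; a real checker would go further and query a registry such as Crossref to confirm each DOI actually resolves.

```python
import re

# Simplified DOI pattern: "10." + 4-9 digit registrant code + "/" + suffix.
DOI_RE = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

def extract_dois(reference: str) -> list[str]:
    """Return every DOI-shaped string found in a reference entry."""
    return DOI_RE.findall(reference)

def flag_suspect(references: list[str]) -> list[str]:
    """Return entries with no DOI at all -- candidates for a closer look.
    (A real tool would also verify that each found DOI resolves.)"""
    return [r for r in references if not extract_dois(r)]

refs = [
    "Smith, J. (2020). Real paper. doi:10.1000/xyz123",
    "Totally made up citation with no identifier (2023).",
]
print(flag_suspect(refs))  # only the second entry is flagged
```

This only catches the crudest fabrications (entries with no identifier at all); hallucinated-but-plausible DOIs require the resolution step against a live registry.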

It isn't "cheating" they're concerned with, it's sloppiness. This dictum isn't some sort of AI ban; it's simply a recognition that if a submission was so low effort that it includes such blatant problems, it's just adding noise.

> think proof of Riemann Hypothesis autonomously generated (end to end) formally proven - is it allowed or not?

Sorry to be rude, but this seems like a dumb question. I want science to progress. A primary purpose of these journals is to progress science. A full proof of the Riemann Hypothesis progresses science. I don't care how it was produced, if Hitler is coauthor, etc, I just care that it is correct. Whether the authors should be rewarded for whatever methods they used can be a separate question.

  • Terence Tao had a nice talk from the Future of Mathematics conference posted yesterday [0] that shapes a lot of my own feelings on this matter.

    The short of it is he argues that being first to correctness shouldn't be the only goal / isn't a great optimisation incentive. Presentation and digestibility of correct results is the missing third once you've finished generation and verification. I completely agree with him. You don't just need an AI-generated proof of the Riemann Hypothesis. You would really like it to be intentional and structured for others to understand.

    A really beautiful quote I learned of in the talk is this:

    > "We are not trying to meet some abstract production quota of definitions, theorems, and proofs. The measure of our success is whether what we do enables people to understand and think more clearly and effectively about math." - William Thurston

    [0] https://www.youtube.com/watch?v=Uc2zt198U_U

    • Ya, I think this totally makes sense. Just to be clear though, I don’t think we’re actually disagreeing. A proof of the Riemann Hypothesis that’s obtuse and basically unreadable is a great step on the path to a proof that is enlightening and clear. If AI provides correct-but-annoying results, I’m confident humans can still benefit from that marginal result.