Comment by xeeeeeeeeeeenu
20 hours ago
> no prior solutions found.
This is no longer true: a prior solution has just been found[1], so the LLM proof has been moved to Section 2 of Terence Tao's wiki[2].
[1] - https://www.erdosproblems.com/forum/thread/281#post-3325
[2] - https://github.com/teorth/erdosproblems/wiki/AI-contribution...
Interesting that in Terence Tao's words: "though the new proof is still rather different from the literature proof)"
And even odder that the proof was by Erdős himself, and yet he listed it as an open problem!
Maybe it was in the training set.
I think that was Tao's point, that the new proof was not just read out of the training set.
44 replies →
It looks like these models work pretty well as natural-language search engines, and at connecting the dots between disparate results that humans haven't connected yet.
They're finding them very effective at literature search, and at autoformalization of human-written proofs.
Pretty soon, this is going to mean the entire historical math literature will be formalized (or, in some cases, found to be in error). Consider the implications of that for training theorem provers.
I think "pretty soon" is a serious overstatement. This does not take into account the difficulty in formalizing definitions and theorem statements. This cannot be done autonomously (or, it can, but there will be serious errors) since there is no way to formalize the "text to lean" process.
What's more, there is almost surely a large amount of human-generated mathematics that's "basically" correct, in the sense that there exists a formal proof that morally fits the arc of the human proof, but that relies on informal or vague reasoning (e.g. diagram arguments) which is hard to truly formalize, even though an expert can use it consistently without making a mistake. This will take a long time to formalize, and I expect it will require a large amount of both human and AI effort.
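To make the "text to Lean" pitfall concrete, here is a toy example (my own hypothetical illustration, unrelated to the Erdős problem): the informal identity "a - b + b = a" is true over the integers, but Lean 4's natural-number subtraction truncates at zero, so a naive formalization that picks Nat instead of Int states something provably false:

    -- Informally "obvious": a - b + b = a.
    -- Over Lean 4's Nat, subtraction truncates at zero (0 - 1 = 0),
    -- so this naive formalization is provably false.
    example : ¬ ∀ a b : Nat, a - b + b = a := by
      intro h
      -- h 0 1 claims that 0 - 1 + 1 = 0, but the left side
      -- evaluates to 1, so `decide` refutes it.
      exact absurd (h 0 1) (by decide)

The faithful version needs Int, or a side condition b ≤ a, and choosing between those is exactly the kind of judgment call that is hard to automate without introducing errors.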
1 reply →
Every time this topic comes up people compare the LLM to a search engine of some kind.
But as far as we know, the proof it wrote is original. Tao himself noted that it’s very different from the other proof (which was only found now).
That’s so far removed from a “search engine” that the term is essentially nonsense in this context.
Hassabis put forth a nice taxonomy of innovation: interpolation, extrapolation, and paradigm shifts.
AI is currently great at interpolation, and in some fields (like biology) there seems to be low-hanging fruit for this kind of connect-the-dots exercise. A human would still be considered smart for connecting these dots IMO.
AI clearly struggles with extrapolation, at least if the new datum is fully outside the training set.
And we will have AGI (if not ASI) if/when AI systems can reliably form new paradigms. It’s a high bar.
This illustrates how unimportant this problem is. A prior solution did exist, but apparently nobody knew because people didn't really care about it. If progress can be had by simply searching for old solutions in the literature, then that's good evidence the supposed progress is imaginary. And this is not the first time this has happened with an Erdős problem.
A lot of pure mathematics seems to consist in solving neat logic puzzles without any intrinsic importance. Recreational puzzles for very intelligent people. Or LLMs.
It's hard to predict which maths result from 100 years ago will surface in, say, quantum mechanics or cryptography.
The likelihood for that is vanishingly low, though, for any given math result.
> "intrinsic importance"
"Intrinsic" in contexts like this is a word for people who are projecting what they consider important onto the world. You can't define it in any meaningful way that's not entirely subjective.
It shows that an LLM can now work on problems like this today, and tomorrow it will be able to do even more.
Don't be so ignorant. A few years ago NO ONE could have come up with something as generic as an LLM, which can help you solve these kinds of problems and also create text adventures and Java code.
The goal posts are strapped to skateboards these days, and the WD40 is applied to the wheels generously.
5 replies →
You can just wait and verify, instead of the publish-then-retract cycles of the last year. It's embarrassing.
There is still enormous value in cleaning up the long tail of somewhat-important stuff. One of the great benefits of Claude Code for me is that smaller issues no longer rot in backlogs, but can at least be attempted immediately.
The difference is that Claude Code actually solves practical problems, but pure (as opposed to applied) mathematics doesn't. Moreover, a lot of pure mathematics seems to be not just useless, but also without intrinsic epistemic value, unlike science. See https://news.ycombinator.com/item?id=46510353
23 replies →
[flagged]
Pity that HN's ability to detect sarcasm is as robust as that of a sentiment analysis model using keyword-matching.
The problem is more that it's an LLM-generated comment that's about 20x as long as it needed to be to get the point across.
8 replies →
That’s just the internet. Detecting sarcasm requires a lot of context external to the content of any text. In person, some of that is mitigated by intonation, facial expressions, etc. Typically it also requires that the reader is a native speaker of the language, or at least extremely proficient.
I firmly believe @threethirtytwo’s reply was not produced by an LLM.
Regardless of whether this text was written by an LLM or a human, it is still slop, with a human behind it just trying to wind people up. If there is a valid point to be made, it should be made briefly.
1 reply →
Are you expecting people who can't detect self-delusions to be able to detect sarcasm, or are you just being cruel?
> This is a relief, honestly. A prior solution exists now, which means the model didn’t solve anything at all. It just regurgitated it from the internet, which we can retroactively assume contained the solution in spirit, if not in any searchable or known form. Mystery resolved.
Vs
> Interesting that in Terence Tao's words: "though the new proof is still rather different from the literature proof)"
I suspect this is AI-generated, but it's quite high quality, and doesn't have any of the telltale signs that most AI-generated content does. How did you generate this? It's great.
Their comments are full of "it's not x, it's y" over and over. Short, pithy sentences. I'm quite confident it's AI-written, maybe with a more detailed prompt than average.
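(A toy sketch of that heuristic, purely for illustration; obviously not a reliable detector, since plenty of humans write this way too:)

    import re

    # Toy heuristic, not a real detector: count "it's not X, it's Y"
    # constructions, a stylistic pattern often attributed to LLM prose.
    PATTERN = re.compile(
        r"\b(?:it|this|that)['’]s not\b[^.?!]*?,\s*(?:it|this|that)['’]s\b",
        re.IGNORECASE,
    )

    def count_not_x_but_y(text: str) -> int:
        return len(PATTERN.findall(text))

    print(count_not_x_but_y("It's not a bug, it's a feature."))  # prints 1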
I guess this is the end of the human internet
6 replies →
Your intuition on AI is out of date by about 6 months. Those telltale signs no longer exist.
It wasn't AI-generated. But if it were, there is currently no way for anyone to tell the difference.
9 replies →
(edit: removed duplicate comment from above, not sure how that happened)
2 replies →
It's bizarre. The same account was previously arguing in favor of emergent reasoning abilities in another thread ( https://news.ycombinator.com/item?id=46453084 ) -- I voted it up, in fact! Turing test failed, I guess.
(edit: fixed link)
3 replies →
Why not plan for a future where a lot of non-trivial tasks are automated instead of living on the edge with all this anxiety?
[flagged]
5 replies →