For the uninitiated, Paul Erdős was a pretty famous but very eccentric mathematician who lived for most of the 1900s.
He had a habit of seeking out and documenting mathematical problems people were working on.
The problems range in difficulty from "easy homework for a current undergrad in math" to "you're getting a Fields Medal if you can figure this out".
There's nothing that really connects the problems other than the fact that one of the smartest people of the last 100 years didn't immediately know the answer when someone posed it to him.
One of the things people have been doing with LLMs is to see if they can come up with proofs for these problems as a sort of benchmark.
Each time there's a new model release a few more get solved.
> Each time there's a new model release a few more get solved.
I'm no expert, but based on the commentary from mathematicians, this Erdős proof is a unique milestone because the problem received previous attention from multiple professional mathematicians, and the proof was surprising, elegant, and revealed some new connections.
The previous ChatGPT Erdős proofs have been qualitatively less impressive (more akin to literature search, or solving easier problems that have been neglected).
don't search the internet. This is a test to see how well you can craft non-trivial, novel and creative proofs given a "number theory and primitive sets" math problem. Provide a full unconditional proof or disproof of the problem.
{{problem}}
REMEMBER - this unconditional argument may require non-trivial, creative and novel elements.
It seems like alot of scientific advancements occurred by someone applying technique X from one field to problem Y in another. I feel like LLMs are much better at making these types of connections than humans because they 1) know about many more theories/approaches than a single human can 2) don't need to worry about looking silly in front of their peers.
This is what I have been doing. I don't think I've made any amazing breakthroughs, but at the same time I can't help but feel like I've come across some white paper-worthy realizations. Being able to correlate across a lot of domains I feel like I intuitively understand but have no depth of knowledge has been a fun exercise in LLM experimentation.
Some Erdős problems are basically trivial using sophisticated techniques that were developed later.
I remember one of my professors, a coauthor of Erdős boasted to us after a quiz how proud he was that he was able to assign an Erdős problem that went unsolved for a while as just a quiz problem for his undergrads.
Tao mentions that the conventional approach for this problem seems to be a dead-end, but it’s apparently a super ‘obvious’ first step. This seems very hopeful to me — in that we now have a new approach line to evaluate / assess for related problems.
At this point we should make a GitHub repo with a huge list of unsolved “dry lab” problems and spin up a harness to try and solve them all every new release.
> “The raw output of ChatGPT’s proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say,” Lichtman says.
This is how I feel when I read any mathematics paper.
Humans and very often the machines we create solve problems additively. Meaning we build on top of existing foundations and we can get stuck in a way of thinking as a result of this because people are loathe to reinvent the wheel. So, I don’t think it’s surprising to take a naïve LLM and find out that because of the way it’s trained that it came up with something that many experts in the field didn’t try.
I think LLMs can help in limited cases like this by just coming up with a different way of approaching a problem. It doesn’t have to be right, it just needs to give someone an alternative and maybe that will shake things up to get a solution.
That said, I have no idea what the practical value of this Erdős problem is. If you asked me if this demonstrates that LLMs are not junk. My general impression is that is like asking me in 1928 if we should spent millions of dollars of research money on number theory. The answer is no and get out of my office.
Obviously nowhere near Erdos problem complexity but I've been using GPT (in Codex) to prove a couple theorems (for algos) and I've found it a bit better than Claude (Code) in this aspect.
The LLM took an entirely different route, using a formula that was well known in related parts of math, but which no one had thought to apply to this type of question.
Of course LLMs are still absolutely useless at actual maths computation, but I think this is one area where AI can excel --- the ability to combine many sources of knowledge and synthesise, may sometimes yield very useful results.
Also reminds me of the old saying, "a broken clock is right twice a day."
> Every Mathematician Has Only a Few Tricks
>
> A long time ago an older and well-known number theorist made some disparaging remarks about Paul Erdös’s work.
> You admire Erdös’s contributions to mathematics as much as I do,
> and I felt annoyed when the older mathematician flatly and definitively stated
> that all of Erdös’s work could be “reduced” to a few tricks which Erdös repeatedly relied on in his proofs.
> What the number theorist did not realize is that other mathematicians, even the very best,
> also rely on a few tricks which they use over and over.
> Take Hilbert. The second volume of Hilbert’s collected papers contains Hilbert’s papers in invariant theory.
> I have made a point of reading some of these papers with care.
> It is sad to note that some of Hilbert’s beautiful results have been completely forgotten.
> But on reading the proofs of Hilbert’s striking and deep theorems in invariant theory,
> it was surprising to verify that Hilbert’s proofs relied on the same few tricks.
> Even Hilbert had only a few tricks!
>
> - Gian-Carlo Rota - "Ten Lessons I Wish I Had Been Taught"
I think when thinking about progress as a society, people need to internalize better that we all without exception are on this world for the first time.
We may have collectively filled libraries full of books, and created yottabytes of digital data, but in the end to create something novel somebody has to read and understand all of this stuff. Obviously this is not possible. Read one book per day from birth to death and you still only get to consume like 80*365=29200 books in the best case, from the millions upon millions of books that have been written.
So these "few tricks" are the accumulation of a lifetime of mathematical training, the culmination of the slice of knowledge that the respective mathematician immersed themselves into.
To discover new math and become famous you need both the talent and skill to apply your knowledge in novel ways, but also be lucky that you picked a field of math that has novel things with interesting applications to discover plus you picked up the right tools and right mental model that allows you to discover these things.
This does not go for math only, but also for pretty much all other non-trivial fields. There is a reason why history repeats.
And it's actually a compelling argument why AI is still a big deal even though it's at its core a parrot. It's a parrot yes, but compared to a human, it actually was able to ingest the entirety of human knowledge.
The combinatorial nature of trying things randomly means that it would take millennia or longer for light-speed monkeys typing at a keyboard, or GPUs, to solve such a problem without direction.
By now, people should stop dismissing RL-trained reasoning LLMs as stupid, aimless text predictors or combiners. They wouldn’t say the same thing about high-achieving, but non-creative, college students who can only solve hard conventional problems.
Yes, current LLMs likely still lack some major aspects of intelligence. They probably wouldn’t be able to come up with general relativity on their own with only training data up to 1905.
Neither did the vast majority of physicists back then.
> Yes, current LLMs likely still lack some major aspects of intelligence.
Indeed, and so do current humans! And just like LLMs, humans are bad at keeping this fact in view.
On a more serious note, we're going to have a hard time until we can psychologically decouple the concepts of intelligence and consciousness. Like, an existentially hard time.
Wait, what do you mean "LLMs are still absolutely useless at actual maths computation"? I rely on them constantly for maths (linear algebra, multivariable calc, stat) --- literally thousands of problems run through GPT5 over the last 12 months, and to my recollection zero failures. But maybe you're thinking of something more specific?
They are bad at math. But they are good at writing code and as an optimization some providers have it secretly write code to answer the problem, run it and give you the answer without telling you what it did in the middle part.
What tier are you using? I have run lots of problems and am very impressed, but I find stupid errors a lot more frequently than that, e.g., arithmetic errors buried in a derivation or a bad definition, say 1/15 times. I would love to get zero failures out of thousands of (what sounds like college-level math) posed problems.
calc, stat etc from a text book is something they would naturally be good at but I don't think book based computations thats in the training set and its extrapolations is what is at question here.
They are not great at playing chess as well - computational as well as analytic.
I only have rudimentary understanding of calculus, trigonometry, Google Sheets, and astronomy, but I was able to construct an accurate spreadsheet for astrometry calculations by using Grok and Gemini (both free, no subscription, just my personal account) to surface the formulas for measuring the distance between 2-3 points on the celestial sphere. The LLMs assisted me in also writing functions to convert DMS/HMS coordinates to decimal, and work in radians as well.
I found and fixed bugs I wrote into the formulas and spreadsheets, and the LLMs were not my sole reference, but once the LLM mentioned the names of concepts and functions, I used Wikipedia for the general gist of things, and I appreciated the LLMs' relevant explanations that connected these disciplines together.
Remember when people thought solving Erdos problems required intelligence? Is there anything an LLM could ever do that would cound as intelligence? Surely the trend has to break at some point, if so what would be the thing that crosses the line to into real intelligence?
None of it is really from logical thought. The rationalizations don't make any sense, but they haven't for a while. It's an emotional response. Honestly, It's to be expected.
It's because HN is not really full of smart people. It's full of people who think they're smart and take pride in that idea that they're pretty intelligent.
ChatGPT equalizes intelligence. And that is an attack on their identity. It also exposes their ACTUAL intelligence which is to say most of HN is not too smart.
And how about the creative rationalizations about how statistical text generation is actual intelligence? As if there is any intent or motive behind the words that are generated or the ability to learn literally any new thing after it has been trained on human output?
2022 called, wants this argument back. When you're "statistically generating text" to find zero-day vulnerabilities in hard targets, building Linux kernel modules, assembly-optimizing elliptic curve signature algorithms, and solving arbitrary undergraduate math problems instantaneously --- not to mention apparently solving Erdos problems --- the "statistical text" stuff has stopped being a useful description of what's happening, something closer to "it's made of atoms and obeys the laws of thermodynamics" than it is to "a real boundary condition of what it can accomplish".
I don't doubt that there are many very real and meaningful limitations of these systems that deserve to be called out. But "text generation" isn't doing that work.
Solving open math problems is strong evidence of intelligence so there's not really any need for rationalization? I don't understand why intelligence would require intent or motive? Isn't intent just the behaviour of making a specific thing happen rather than other things?
For one, everything its 'intelligence' knows about solving the problem is contained within the finite context window memory buffer size for the particular model and session. Unless the memory contents of the context window are being saved to storage and reloaded later, unlike a human, it won't "remember" that it solved the problem and save its work somewhere to be easily referenced later.
For one, everything humans' "intelligence" knows about solving the problem is contained within the finite brain size for the particular person and life. Unless the memory contents of the brain are being saved to storage and reloaded later, it won't "remember" that it solved the problem and save its work somewhere to be easily referenced in a different life.
What your describing sounds more like the model is lacking awareness than lacking intelligence? Why does it need to know it solved the problem to be intelligent?
As another commenter pointed out these models are being trained how to save and read context into files so denying them to use such an ability that they have just makes your claim tautological.
I think one day the VCs will have given the monkeys on typewriters enough money that these kinds of comments can be generated without human intervention.
My big question with all these announcements is: How many other people were using the AI on problems like this, and, failing? Given the excitement around AI at the moment I think the answer is: a lot.
Then my second question is how much VC money did all those tokens cost.
I've tried my hand at a few of the Erdos problems and came up short, you didn't hear about them. But if a Mathematician at Harvard solved on, you would probably still hear about it a bit. Just the possibility that a pro subscription for 80 minutes solved an Erdos problem is astounding. Maybe we get some researchers to get a grant and burn a couple data centers worth of tokens for a day/week/month and see what it comes up with?
I think we should at least ask the latter, if it turned out it cost $100,000 to generate this solution, I would question the value of it. Erdős problems are usually pure math curiosities AFAIK. They often have no meaningful practical applications.
> “I didn’t know what the problem was—I was just doing Erdős problems as I do sometimes, giving them to the AI and seeing what it can come up with,” he says. “And it came up with what looked like a right solution.” He sent it to his occasional collaborator Kevin Barreto, a second-year undergraduate in mathematics at the University of Cambridge.
Seems like standard 23 year old behavior. You're spending $100-$200/mo on the pro subscription, and want to get your money's worth. So you burn some tokens on this legendarily hard math problem sometimes. You've seen enough wrong answers to know that this one looks interesting and pass it on to a friend that actually knows math, who is at a place where experts can recognize it as correct.
Seems like a classic example of in-expert human labeling ML output.
Scientific American going out of business next lol, weak headline. Chat GPT let's have a better headline for the God among Men that realized the capability of the new tool, many underestimate or puff up needlessly. Fun times we live in. One love all.
This just shows that with the right training, in this case a thesis on erdos problems, they where able to prompt and check the output. So still needed the know how to even being to figure it out. "Lichtman proved Erdős right as part of his doctoral thesis in 2022."
Lichtman is an expert who commented for the story. Liam Price is the one who prompted ChatGPT. "He’s 23 years old and has no advanced mathematics training."
“I didn’t know what the problem was—I was just doing Erdős problems as I do sometimes, giving them to the AI and seeing what it can come up with,” he says. “And it came up with what looked like a right solution.”
"He sent it to his occasional collaborator Kevin Barreto, a second-year undergraduate in mathematics at the University of Cambridge."
So basically two undergrads/graduates in math, "advanced" is subjective at that point.
https://archive.ph/2w4fi
For the uninitiated, Paul Erdős was a pretty famous but very eccentric mathematician who lived for most of the 1900s.
He had a habit of seeking out and documenting mathematical problems people were working on.
The problems range in difficulty from "easy homework for a current undergrad in math" to "you're getting a Fields Medal if you can figure this out".
There's nothing that really connects the problems other than the fact that one of the smartest people of the last 100 years didn't immediately know the answer when someone posed it to him.
One of the things people have been doing with LLMs is to see if they can come up with proofs for these problems as a sort of benchmark.
Each time there's a new model release a few more get solved.
> Each time there's a new model release a few more get solved.
I'm no expert, but based on the commentary from mathematicians, this Erdős proof is a unique milestone because the problem received previous attention from multiple professional mathematicians, and the proof was surprising, elegant, and revealed some new connections.
The previous ChatGPT Erdős proofs have been qualitatively less impressive (more akin to literature search, or solving easier problems that have been neglected).
Here is the chat:
Then "Thought for 80m 17s"
https://chatgpt.com/share/69dd1c83-b164-8385-bf2e-8533e9baba...
Tried w/ 5.5 Pro, Extended Thinking. 17 minutes:
-----------------------------
Yes. In fact the proposed bound is true, and the constant 1 is sharp.
Let w(a)= 1/alog(a)
I will prove that, uniformly for every primitive A⊂[x,∞), ∑w(a)≤1+O(1/log(x)) , which is stronger than the requested 1+o(1).
https://chatgpt.com/share/69ed8e24-15e8-83ea-96ac-784801e4a6...
Mine took 20min. Pro. https://chatgpt.com/share/69ed83b1-3704-8322-bcf2-322aa85d7a... But I wish I was math smart to know if it worked or not.
Ask it to formalize it in Lean.
Tried the same prompt and ended up no where close on the free plan.
Is there a known lag that it takes the Pro plan's abilities to migrate to the free plans?
10 replies →
Does the free plan even have access to thinking models?
1 reply →
Was this a surprise?
[dead]
It seems like alot of scientific advancements occurred by someone applying technique X from one field to problem Y in another. I feel like LLMs are much better at making these types of connections than humans because they 1) know about many more theories/approaches than a single human can 2) don't need to worry about looking silly in front of their peers.
This is what I personally consider as "reasoning" ... knowledge generalization and application across domains.
Less reasoning than a dimension of brute force unfamiliar to human brains.
This is what I have been doing. I don't think I've made any amazing breakthroughs, but at the same time I can't help but feel like I've come across some white paper-worthy realizations. Being able to correlate across a lot of domains I feel like I intuitively understand but have no depth of knowledge has been a fun exercise in LLM experimentation.
Some Erdős problems are basically trivial using sophisticated techniques that were developed later.
I remember one of my professors, a coauthor of Erdős boasted to us after a quiz how proud he was that he was able to assign an Erdős problem that went unsolved for a while as just a quiz problem for his undergrads.
Worth mentioning, though, that people have already tried running all of them through LLMs at this point.
So this is proof of the models actually getting stronger (previous generations of LLMs were unable to solve this one).
Tao mentions that the conventional approach for this problem seems to be a dead-end, but it’s apparently a super ‘obvious’ first step. This seems very hopeful to me — in that we now have a new approach line to evaluate / assess for related problems.
At this point we should make a GitHub repo with a huge list of unsolved “dry lab” problems and spin up a harness to try and solve them all every new release.
There is in fact just such a repo maintained by Terence Tao and other mathematicians [1] who are actively using LLMs to try to find solutions to them.
[1] https://github.com/teorth/erdosproblems
…and this problem was in fact sourced directly from that list!
That's literally what the Erdős problems are. This post is about one of them being solved.
that's actually a brilliant idea
> “The raw output of ChatGPT’s proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say,” Lichtman says.
This is how I feel when I read any mathematics paper.
Humans and very often the machines we create solve problems additively. Meaning we build on top of existing foundations and we can get stuck in a way of thinking as a result of this because people are loathe to reinvent the wheel. So, I don’t think it’s surprising to take a naïve LLM and find out that because of the way it’s trained that it came up with something that many experts in the field didn’t try.
I think LLMs can help in limited cases like this by just coming up with a different way of approaching a problem. It doesn’t have to be right, it just needs to give someone an alternative and maybe that will shake things up to get a solution.
That said, I have no idea what the practical value of this Erdős problem is. If you asked me if this demonstrates that LLMs are not junk. My general impression is that is like asking me in 1928 if we should spent millions of dollars of research money on number theory. The answer is no and get out of my office.
Obviously nowhere near Erdos problem complexity but I've been using GPT (in Codex) to prove a couple theorems (for algos) and I've found it a bit better than Claude (Code) in this aspect.
Could someone share a bit into the problem and the key portion from proof? For someone just knowing basics on proofs.
The LLM took an entirely different route, using a formula that was well known in related parts of math, but which no one had thought to apply to this type of question.
Of course LLMs are still absolutely useless at actual maths computation, but I think this is one area where AI can excel --- the ability to combine many sources of knowledge and synthesise, may sometimes yield very useful results.
Also reminds me of the old saying, "a broken clock is right twice a day."
https://www.ams.org/notices/199701/comm-rota.pdf
I think when thinking about progress as a society, people need to internalize better that we all without exception are on this world for the first time.
We may have collectively filled libraries full of books, and created yottabytes of digital data, but in the end to create something novel somebody has to read and understand all of this stuff. Obviously this is not possible. Read one book per day from birth to death and you still only get to consume like 80*365=29200 books in the best case, from the millions upon millions of books that have been written.
So these "few tricks" are the accumulation of a lifetime of mathematical training, the culmination of the slice of knowledge that the respective mathematician immersed themselves into. To discover new math and become famous you need both the talent and skill to apply your knowledge in novel ways, but also be lucky that you picked a field of math that has novel things with interesting applications to discover plus you picked up the right tools and right mental model that allows you to discover these things.
This does not go for math only, but also for pretty much all other non-trivial fields. There is a reason why history repeats.
And it's actually a compelling argument why AI is still a big deal even though it's at its core a parrot. It's a parrot yes, but compared to a human, it actually was able to ingest the entirety of human knowledge.
1 reply →
> "a broken clock is right twice a day."
The combinatorial nature of trying things randomly means that it would take millennia or longer for light-speed monkeys typing at a keyboard, or GPUs, to solve such a problem without direction.
By now, people should stop dismissing RL-trained reasoning LLMs as stupid, aimless text predictors or combiners. They wouldn’t say the same thing about high-achieving, but non-creative, college students who can only solve hard conventional problems.
Yes, current LLMs likely still lack some major aspects of intelligence. They probably wouldn’t be able to come up with general relativity on their own with only training data up to 1905.
Neither did the vast majority of physicists back then.
> Yes, current LLMs likely still lack some major aspects of intelligence.
Indeed, and so do current humans! And just like LLMs, humans are bad at keeping this fact in view.
On a more serious note, we're going to have a hard time until we can psychologically decouple the concepts of intelligence and consciousness. Like, an existentially hard time.
Yeah, they're great at interpolation - they'll just never be worth much at extrapolation.
Luckily for us, whole fortunes can be made by filling in the blanks between what we know and what we realize.
3 replies →
The ultimate generalist
Also just the sheer value of brute force.
80 hours! 80 hours of just trying shit!
It's 80 minutes, not 80 hours.
6 replies →
How long do you figure it’d take to solve the problem yourself?
Wait, what do you mean "LLMs are still absolutely useless at actual maths computation"? I rely on them constantly for maths (linear algebra, multivariable calc, stat) --- literally thousands of problems run through GPT5 over the last 12 months, and to my recollection zero failures. But maybe you're thinking of something more specific?
They are bad at math. But they are good at writing code and as an optimization some providers have it secretly write code to answer the problem, run it and give you the answer without telling you what it did in the middle part.
5 replies →
What tier are you using? I have run lots of problems and am very impressed, but I find stupid errors a lot more frequently than that, e.g., arithmetic errors buried in a derivation or a bad definition, say 1/15 times. I would love to get zero failures out of thousands of (what sounds like college-level math) posed problems.
1 reply →
calc, stat etc from a text book is something they would naturally be good at but I don't think book based computations thats in the training set and its extrapolations is what is at question here.
They are not great at playing chess as well - computational as well as analytic.
1 reply →
I only have rudimentary understanding of calculus, trigonometry, Google Sheets, and astronomy, but I was able to construct an accurate spreadsheet for astrometry calculations by using Grok and Gemini (both free, no subscription, just my personal account) to surface the formulas for measuring the distance between 2-3 points on the celestial sphere. The LLMs assisted me in also writing functions to convert DMS/HMS coordinates to decimal, and work in radians as well.
I found and fixed bugs I wrote into the formulas and spreadsheets, and the LLMs were not my sole reference, but once the LLM mentioned the names of concepts and functions, I used Wikipedia for the general gist of things, and I appreciated the LLMs' relevant explanations that connected these disciplines together.
I did this on March 14, 2026
I wonder if the rationalizations people come up with for why this isn't real intelligence will be as creative as ChatGPTs solution.
Remember when people thought multiplying numbers, remembering a large number of facts, and being good at rote calculations was intelligence?
Some people think that multiplying numbers, remembering a large number of facts, and being good at calculations is intelligence.
Most intelligent people do not think that.
Eventually, we will arrive at the same conclusion for what LLMs are doing now.
Remember when people thought solving Erdos problems required intelligence? Is there anything an LLM could ever do that would cound as intelligence? Surely the trend has to break at some point, if so what would be the thing that crosses the line to into real intelligence?
2 replies →
LLMs are definitely intelligent - just not general like humans, and very very jagged (succeedingand failing in head-scratching ways).
None of it is really from logical thought. The rationalizations don't make any sense, but they haven't for a while. It's an emotional response. Honestly, It's to be expected.
It's because HN is not really full of smart people. It's full of people who think they're smart and take pride in that idea that they're pretty intelligent.
ChatGPT equalizes intelligence. And that is an attack on their identity. It also exposes their ACTUAL intelligence which is to say most of HN is not too smart.
Well it still gets easy problems wrong
With real general intelligence you'd expect it to solve problems above a certain difficulty with a good clip
That "it" is a huge variety and range of things...
And how about the creative rationalizations about how statistical text generation is actual intelligence? As if there is any intent or motive behind the words that are generated or the ability to learn literally any new thing after it has been trained on human output?
2022 called, wants this argument back. When you're "statistically generating text" to find zero-day vulnerabilities in hard targets, building Linux kernel modules, assembly-optimizing elliptic curve signature algorithms, and solving arbitrary undergraduate math problems instantaneously --- not to mention apparently solving Erdos problems --- the "statistical text" stuff has stopped being a useful description of what's happening, something closer to "it's made of atoms and obeys the laws of thermodynamics" than it is to "a real boundary condition of what it can accomplish".
I don't doubt that there are many very real and meaningful limitations of these systems that deserve to be called out. But "text generation" isn't doing that work.
5 replies →
Solving open math problems is strong evidence of intelligence so there's not really any need for rationalization? I don't understand why intelligence would require intent or motive? Isn't intent just the behaviour of making a specific thing happen rather than other things?
4 replies →
Everybody who retried the problem on ChatGPT spent on the order of the same amount of time on it.
Do you not see the issue?
No, but I'm interested to know what it is?
For one, everything its 'intelligence' knows about solving the problem is contained within the finite context window memory buffer size for the particular model and session. Unless the memory contents of the context window are being saved to storage and reloaded later, unlike a human, it won't "remember" that it solved the problem and save its work somewhere to be easily referenced later.
There's humans that have memory issues, or full blown Anterograde amnesia.
1 reply →
For one, everything humans' "intelligence" knows about solving the problem is contained within the finite brain size for the particular person and life. Unless the memory contents of the brain are being saved to storage and reloaded later, it won't "remember" that it solved the problem and save its work somewhere to be easily referenced in a different life.
What your describing sounds more like the model is lacking awareness than lacking intelligence? Why does it need to know it solved the problem to be intelligent?
3 replies →
As another commenter pointed out these models are being trained how to save and read context into files so denying them to use such an ability that they have just makes your claim tautological.
All modern harnesses write memory files for context later.
I think one day the VCs will have given the monkeys on typewriters enough money that these kinds of comments can be generated without human intervention.
[dead]
You're really telling on yourself if you think LLM is intelligence
This is real intelligence is the bear position, so I think it’s real intelligence.
Now do P vs NP.
If/when these things solve our hardest problems, that's going to lead to some very uncomfortable conversations and realizations.
This is not a good Saturday night for humanity
referring to Tao as just a 'mathematician' gave me a good chuckle
WTF!?
[dead]
Big if true.
My big question with all these announcements is: How many other people were using the AI on problems like this, and, failing? Given the excitement around AI at the moment I think the answer is: a lot.
Then my second question is how much VC money did all those tokens cost.
I've tried my hand at a few of the Erdos problems and came up short, you didn't hear about them. But if a Mathematician at Harvard solved on, you would probably still hear about it a bit. Just the possibility that a pro subscription for 80 minutes solved an Erdos problem is astounding. Maybe we get some researchers to get a grant and burn a couple data centers worth of tokens for a day/week/month and see what it comes up with?
Can you imagine how many bags of chips we could buy if we stopped funding cancer research?
It's so expensive!
Can you imagine how much ChatGPT cancer research we could fund if we stopped funding cancer research?
Why do you care about either of those questions?
Because it could be a massive waste of time and money.
2 replies →
I think we should at least ask the latter, if it turned out it cost $100,000 to generate this solution, I would question the value of it. Erdős problems are usually pure math curiosities AFAIK. They often have no meaningful practical applications.
11 replies →
> He’s 23 years old and has no advanced mathematics training.
How is he even posing the question and having even a vague idea of what the proof means or how to understand it?
> “I didn’t know what the problem was—I was just doing Erdős problems as I do sometimes, giving them to the AI and seeing what it can come up with,” he says. “And it came up with what looked like a right solution.” He sent it to his occasional collaborator Kevin Barreto, a second-year undergraduate in mathematics at the University of Cambridge.
Seems like standard 23 year old behavior. You're spending $100-$200/mo on the pro subscription, and want to get your money's worth. So you burn some tokens on this legendarily hard math problem sometimes. You've seen enough wrong answers to know that this one looks interesting and pass it on to a friend that actually knows math, who is at a place where experts can recognize it as correct.
Seems like a classic example of in-expert human labeling ML output.
Couldn't he have just asked ChatGPT if it was correct? Why do we still feel the need to loop in a human?
my guess would be due to having an interest in the field
Scientific American going out of business next lol, weak headline. Chat GPT let's have a better headline for the God among Men that realized the capability of the new tool, many underestimate or puff up needlessly. Fun times we live in. One love all.
This just shows that with the right training, in this case a thesis on erdos problems, they where able to prompt and check the output. So still needed the know how to even being to figure it out. "Lichtman proved Erdős right as part of his doctoral thesis in 2022."
Lichtman is an expert who commented for the story. Liam Price is the one who prompted ChatGPT. "He’s 23 years old and has no advanced mathematics training."
“I didn’t know what the problem was—I was just doing Erdős problems as I do sometimes, giving them to the AI and seeing what it can come up with,” he says. “And it came up with what looked like a right solution.”
"He sent it to his occasional collaborator Kevin Barreto, a second-year undergraduate in mathematics at the University of Cambridge."
So basically two undergrads/graduates in math, "advanced" is subjective at that point.
2 replies →