Show HN: Another experiment with an Erdos problem and LLMs
Background: I am a coder, not a mathematician, but I was quite entertained by this story:
https://www.erdosproblems.com/691
Given \( A \subseteq \mathbb{N} \), let \( M_A = \{ n \geq 1 : a \mid n \textrm{ for some } a \in A \} \) be the set of multiples of A.
Find a necessary and sufficient condition on A for \( M_A \) to have density 1.
My approach: I used DeepSeek in Expert mode with the same prompt as in the linked HN submission. It thought for a very long time, but I was doing other things in the background, so I didn't time it precisely. I pressed "Continue" twice over the space of maybe 60 minutes; the output says it thought for about 46 minutes.
Once it generated a proof, I asked Opus 4.7 to review it, then fed the review back into DeepSeek, which made edits, corrections, and refinements. This back-and-forth continued until Opus 4.7 was reasonably happy. At that point I called in Gemini 3.1 Pro Preview, which raised issues Opus had missed. Opus acknowledged the feedback, and I passed it to DeepSeek for a final round. Essentially, Opus's verdict was that DeepSeek had produced a "clean exposition of a D[avenport]-E[rdos] corollary", not a new result. The result may well already be known (DeepSeek was not allowed to use the internet for this phase), or it may simply be wrong.
In "simple" terms:
The argument actually proves a stronger fact for every set \( A \) of natural numbers:
The upper density of the set \( M_A \) equals the largest possible lower density you can get from finite subsets of \( A \), and that also equals the lower density of \( M_A \).
When the upper density is 1, it forces the lower density to also be 1, so the natural (ordinary) density exists and equals 1 automatically, without needing any extra conditions.
The only non-basic part of the proof is the Davenport–Erdős theorem; everything else is simple.
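To make the "density of the set of multiples" concrete, here is a small Python sketch (my own illustration, not part of any of the generated proofs). For a finite A, the density of \( M_A \) exists and can be computed exactly by inclusion-exclusion over least common multiples; the empirical count up to N should converge to it:

```python
from fractions import Fraction
from itertools import combinations
from math import lcm

def exact_density(A):
    """Exact density of M_A for a finite set A, via inclusion-exclusion:
    density = sum over nonempty S subset of A of (-1)^(|S|+1) / lcm(S)."""
    d = Fraction(0)
    for k in range(1, len(A) + 1):
        for sub in combinations(A, k):
            d += (-1) ** (k + 1) * Fraction(1, lcm(*sub))
    return d

def empirical_density(A, N=10**5):
    """Fraction of n in [1, N] divisible by some a in A."""
    count = sum(1 for n in range(1, N + 1) if any(n % a == 0 for a in A))
    return count / N
```

For example, A = {2, 3} gives density 1/2 + 1/3 - 1/6 = 2/3, and the empirical count up to 10^5 agrees closely. The interesting (Davenport-Erdős) part of the problem is what happens for infinite A, where this finite computation only gives lower bounds.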
In any case, these were my takeaways:
- These new models seem surprisingly capable, especially when used in conjunction with each other, even with fairly simple prompts
- I am quite impressed by DeepSeek. I'm going to review its coding ability, and may even switch completely from Anthropic
- This was a genuinely interesting exercise, even if I have no idea if any of it is correct or useful
Some other observations:
- Opus was really fast at reviewing DeepSeek's output: literally seconds
- Gemini had trouble figuring out what "Erdos 691" referred to
- The free version of ChatGPT generated mostly useless output, so I didn't include it
Chat links below:
https://chat.deepseek.com/share/hpguvrhcxn226bi3hn
https://claude.ai/share/4f3ccad1-d862-4e37-8333-8a1ebd84b38f
https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...