Show HN: Another experiment with an Erdos problem and LLMs


Background: I am a coder, not a mathematician, but I was quite entertained by this story:

https://www.erdosproblems.com/691

  Given A\subseteq \mathbb{N} let M_A=\{ n \geq 1 : a\mid n\textrm{ for some }a\in A\} be the set of multiples of A. 
  Find a necessary and sufficient condition on A for M_A to have density 1.
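For a *finite* A the density of M_A always exists and can be computed exactly by inclusion-exclusion over lcms of subsets. Here is a small sketch of that (my own illustration, not from any of the chats; the function name is made up):

```python
from fractions import Fraction
from itertools import combinations
from math import lcm

def density_of_multiples(A):
    """Exact natural density of M_A = {n >= 1 : a | n for some a in A},
    for a finite A, via inclusion-exclusion over subset lcms."""
    A = list(A)
    return sum(
        (-1) ** (k + 1) * Fraction(1, lcm(*sub))
        for k in range(1, len(A) + 1)
        for sub in combinations(A, k)
    )

print(density_of_multiples([2, 3]))  # 1/2 + 1/3 - 1/6 = 2/3
```

The hard part of the problem is of course infinite A, where the density of M_A need not exist in general.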

My approach: I used DeepSeek in Expert mode, with the same prompt as in the linked HN submission. It thought for a very long time, but I was doing other things in the background so I didn't really time it. I pressed "Continue" twice over the space of maybe 60 minutes. The output says it thought for about 46 minutes.

Once it generated a proof, I asked Opus 4.7 to review it, then fed the review back into DeepSeek, which made edits, corrections and refinements. This back-and-forth continued until Opus 4.7 was reasonably happy. At that point I brought in Gemini 3.1 Pro Preview, which raised issues that Opus had missed. Opus acknowledged the feedback, and I gave it to DeepSeek for a final round. In Opus's assessment, what DeepSeek produced was a "clean exposition of a D[avenport]-E[rdos] corollary", not a new result. In all likelihood the result is already known (DeepSeek was not allowed to use the internet for this phase), or it may simply be wrong.

In "simple" terms:

  The argument actually proves a stronger fact for every set \( A \) of natural numbers:  
  The upper density of the set \( M_A \) equals the largest possible lower density you can get from finite subsets of \( A \), and that also equals the lower density of \( M_A \).
  When the upper density is 1, it forces the lower density to also be 1, so the natural (ordinary) density exists and equals 1 automatically, without needing any extra conditions.
  The only non-basic part of the proof is the Davenport–Erdős theorem; everything else is simple.
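To get a feel for the "largest lower density you can get from finite subsets" part, here is a quick sketch of my own (not from any of the chats; the function name is made up). Take A = the primes: then M_A is every n >= 2, so the density is 1, and the exact densities of the finite truncations climb toward it. For distinct primes the inclusion-exclusion sum collapses to a product:

```python
from fractions import Fraction

def density_prime_multiples(primes):
    # For a set A of distinct primes, inclusion-exclusion collapses to
    # d(M_A) = 1 - prod over p in A of (1 - 1/p),
    # since the "missed" integers are those coprime to every p in A.
    missed = Fraction(1)
    for p in primes:
        missed *= 1 - Fraction(1, p)
    return 1 - missed

primes = [2, 3, 5, 7, 11, 13]
for k in range(1, len(primes) + 1):
    print(primes[:k], float(density_prime_multiples(primes[:k])))
```

The printed densities increase with k (0.5, 0.666..., 0.733..., and so on), and by Mertens' theorem the product over all primes tends to 0, so the finite-subset densities do approach 1.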

In any case, these were my takeaways:

- These new models seem to be surprisingly capable, especially when used in conjunction with each other, even with fairly simple prompts

- I am quite impressed by DeepSeek. I'm going to evaluate its coding ability, and may even switch completely from Anthropic

- This was a genuinely interesting exercise, even if I have no idea if any of it is correct or useful

Some other observations:

- Opus was really fast at reviewing DeepSeek's output. Literally seconds

- Gemini had trouble figuring out what "Erdos 691" referred to

- The free version of ChatGPT generated mostly useless output, so I didn't include it

Chat links below:

https://chat.deepseek.com/share/hpguvrhcxn226bi3hn

https://claude.ai/share/4f3ccad1-d862-4e37-8333-8a1ebd84b38f

https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...