Comment by libraryofbabel

2 days ago

2026 should be interesting. This stuff is not magic, and progress is always going to be gradual with solutions to less interesting or "easier" problems first, but I think we're going to see more milestones like this with AI able to chip away around the edges of unsolved mathematics. Of course, that will require a lot of human expertise too: even this one was only "solved more or less autonomously by AI (after some feedback from an initial attempt)".

People are still going to be moving the goalposts on this and claiming it's not all that impressive, or that the solution must have been in the training data, or something, but at this point that's getting dangerously close to arguing that Terence Tao doesn't know what he's talking about, which, to say the least, is a perilous position.

At this point, I think I'm making a belated New Year's resolution to stop arguing with people who are still saying that LLMs are stochastic parrots that just remix their training data and can never come up with anything novel. That discussion is now dead. There are lots of fascinating issues to work out with how we can best apply LLMs to interesting problems (or get them to write good code), but to even start solving those issues you have to accept that they are at least somewhat capable of doing novel things.

In 2023 I would have bet hard against us getting to this point ("there's no way chatbots can actually reason their way through novel math!"), but here we are three years later. I wonder what comes next?

Uh, this was exactly a "remix" of similar proofs that most likely were in the training data. It's just that some people underestimate how compelling that "remix" ability can be, especially when paired with a direct awareness of formal logical errors in one's attempted proof and of how they might be addressed in the typical case.

  • Then what sort of math problem would be a milestone for you where an AI was doing something novel?

    Or are you just saying that solving novel problems involves remixing ideas? Well, that's true for human problem solving too.

    • > Then what sort of math problem would be a milestone for you where an AI was doing something novel?

      What? If we're discussing novel synthesis, and it's being contrasted with answer-from-search / answer-from-remix.. the problem does not matter. Only the answer and the originality of the approach. Connecting two fields that were not previously connected is novel, or applying a new kind of technique to an old problem. Recognizing that an unsolved problem is very much like a solved one is search / remix. So what happened here? Tao says it is

      > is largely consistent with other recent demonstrations of AI using existing methods to resolve Erdos problem

      Existing. Methods. Tao also says "This is a demonstration of the genuine increase in capability of these tools in recent months". This is the sentence everyone will focus on, so what is that capability?

      > the more interesting capability revealed by these events is the ability to rapidly write and rewrite new versions of a text as needed, even if one was not the original author of the argument.

      Rejoice! But rejoice for the right reasons, and about what actually happened. Style and voice transformations, interesting new capabilities for fuzzy search. Correct usage of external tools for heavy-lifting with symbolics. And yes, actual problem solving. Novel techniques, creativity, originality though? IDK, sounds kind of optimistic based on the detail here.


The goalposts are still the same. We want to be able to independently verify that an AI can do something instead of just hearing such a claim from a corporation that is absolutely willing to lie through their teeth if it gets them money.

  • Not disagreeing with you, but I don't think Tao is blowing this out of proportion either. I think it's a pretty reasonable way of saying, "Hey, AI is now capable of something it wasn't able to do before".

  • Terence Tao isn’t part of any AI corporation though? He’s purely a celebrated academic telling us this checks out.

    • From his posts, it’s unclear who actually did the experiment. He seems to only be commenting on the results? Or am I missing something?