Comment by falcor84

5 hours ago

> This requires logic, which LLMs lack.

What? I've heard many takes on what AI lacks, but never this one. We had ChatGPT being able to solve an Erdős problem on its own yesterday [0]; how could you explain that if it cannot do logic?

[0] https://news.ycombinator.com/item?id=47903126

The LLM didn't solve an Erdős problem; it generated text that a human looked at, cleaned up, corrected, and used as the basis for a solution.

WRT logic, there are multiple occasions of LLMs answering trivial logic puzzles incorrectly. Of course, each occasion that becomes public is added to the training data and overfitted on, but if you embed the same puzzle in a more subtle way, LLMs will fail again.

  • From the article about the Erdos problem:

    > “This one is a bit different because people did look at it, and the humans that looked at it just collectively made a slight wrong turn at move one,” says Terence Tao, a mathematician at the University of California, Los Angeles, who has become a prominent scorekeeper for AI’s push into his field. “What’s beginning to emerge is that the problem was maybe easier than expected, and it was like there was some kind of mental block.”

    > “There was kind of a standard sequence of moves that everyone who worked on the problem previously started by doing,” Tao says. The LLM took an entirely different route, using a formula that was well known in related parts of math, but which no one had thought to apply to this type of question.

    > “The raw output of ChatGPT’s proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say,” Lichtman says. But now he and Tao have shortened the proof so that it better distills the LLM’s key insight.

    > More importantly, they already see other potential applications of the AI’s cognitive leap. “We have discovered a new way to think about large numbers and their anatomy,” Tao says. “It’s a nice achievement. I think the jury is still out on the long-term significance.”

    You can debate whether the LLM used logic or not. I don't think you can debate that the LLM has in this case elevated human thinking, by leading us to a solution that had eluded world-class mathematicians for 60 years. And a new way to think "about large numbers and their anatomy".

    And if it works for Terence Tao and Erdős problems, then I'm certainly not above using AI to help brainstorm solutions for my little app at work.

    • Sure, LLMs are good at generating text that humans can interpret as educated guesses. But a list of educated guesses is not 'enumerating options': an informed decision requires a complete list of options so that nothing is missed. Imagine using a Monte Carlo method with a sample size of 3 to find a function's extremum - that's the equivalent of using an LLM-generated list of options to make a decision.
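      The Monte Carlo analogy can be made concrete with a toy sketch (the function, sample size, and seed below are made up purely for illustration): with only 3 random samples, a narrow global maximum is almost certainly missed, so the "best option found" is nowhere near the true optimum.

      ```python
      import random

      # Toy objective: mostly flat, with a narrow global maximum near x = 0.5.
      def f(x):
          return 1.0 if abs(x - 0.5) < 0.001 else x * 0.1

      random.seed(42)  # fixed seed so the sketch is reproducible

      # "Monte Carlo with a sample size of 3": draw 3 random candidates,
      # keep the one with the highest objective value.
      candidates = [random.uniform(0.0, 1.0) for _ in range(3)]
      best = max(candidates, key=f)

      # The narrow peak (true maximum, f = 1.0) occupies ~0.2% of the domain,
      # so 3 samples almost never land on it; the chosen "best" is far from optimal.
      print(f"best candidate: {best:.3f}, value: {f(best):.3f}")
      ```

      The point of the analogy: picking the best of a tiny, incomplete sample tells you little about the true optimum, just as picking from a short LLM-generated list of options tells you little about the full option space.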

  • > WRT logic, there are multiple occasions of LLMs answering trivial logic puzzles incorrectly.

    There are multiple occasions of me answering trivial logic puzzles incorrectly. Is that enough for you to deduce that I "lack" logic?

    Humans make mistakes all the time, and indeed we say "To err is human"; why should we expect AI not to?