Comment by jonahx

18 hours ago

As a side question, do you think using tools like Lean will become a staple of these "deep reasoning" LLM flavors?

It seems that LLMs excel (relative to other paradigms) in the kind of "loose" creative thinking humans do, but are also prone to the same kinds of mistakes humans make (hallucinations, etc). Just as Lean and other formal systems can help humans find subtle errors in their own thinking, they could do the same for LLMs.

I was surprised to see them not using tools for it; that feels like a more reliable way to get useful results for this kind of thing.

I get the impression that not using tools is part of the point, though: it helps demonstrate how much mathematical "reasoning" you can get out of just a model on its own.

  • Yes, I'm similarly surprised. Intuitively I'd think it's much better to train the model to use Lean, since that makes RL much easier (the Lean kernel gives you an objective signal for whether a proof is correct). It also seems more useful in some ways.

    But all the model providers are putting emphasis on the "this is only using natural language" angle, which I think is interesting both from a "this is easier for humans to actually use" perspective and from a "look how general the model is" perspective.
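
    To make the "objective signal" point concrete, here's a toy illustration (my own example, not from any provider): Lean either accepts a proof at type-checking time or rejects it, which is exactly the kind of binary reward RL training wants.

    ```lean
    -- A correct proof: the Lean kernel type-checks it, so a verifier
    -- wrapped around Lean can hand back an unambiguous "success" reward.
    theorem add_comm' (a b : Nat) : a + b = b + a := Nat.add_comm a b

    -- A wrong attempt, e.g.
    --   theorem bad (n : Nat) : n + 1 = n := rfl
    -- fails to elaborate, yielding an unambiguous "failure" reward.
    -- No human grading or fuzzy natural-language judging is needed.
    ```

    Natural-language proofs, by contrast, have to be graded by humans or by another model, which reintroduces exactly the hallucination problem you'd hope the formal system would eliminate.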