
Comment by stonogo

2 days ago

Every stage of this 3-stage pipeline is an LLM.

1. "The search algorithm is a highly parallel Monte Carlo Graph Search (MCGS) using a large transformer as its policy and value function." ... "We use a generative policy to take progressively widened [7] samples from the large action space of Lean tactics, conditioning on the Lean proof state, proof history, and, if available, an informal proof. We use the same model and prompt (up to a task token) to compute the value function which guides the search."

See that 'large transformer' phrase? That's where the LLM is involved.

2. "A lemma-based informal reasoning system which generates informal proofs of mathematical statements, breaks these proofs down into lemmas, formalizes each lemma into Lean, and iterates this process based on formal feedback" ... "First, the actions it generates consist of informal comments in addition to Lean tactics. Second, it uses a hidden chain of thought with a dynamically set thinking budget before predicting an action."

Unless you're proposing that this team solved AGI, "chain of thought" is a specific term of art in LLMs.

3. "A geometry solver which solves plane geometry problems outside of Lean using an approach based on AlphaGeometry [45]." ... following the reference: "AlphaGeometry is a neuro-symbolic system that uses a neural language model, trained from scratch on our large-scale synthetic data, to guide a symbolic deduction engine through infinite branching points in challenging problems."

AlphaGeometry, like all of DeepMind's Alpha tools, is an LLM.

Instead of accusing people of not reading the paper, perhaps you should put some thought into what the things in the paper actually represent.

If you think "transformer" = LLM, you don't understand the basic terminology of the field. This is like calling AlphaFold an LLM because it uses a transformer.

  • No, it isn't. They call out ExIt as an inspiration as well as AlphaZero, and the implementation of these things (available in many of their authors' papers) is almost indistinguishable from LLMs. The architecture isn't novel, which is why this paper is about the pipeline instead of about any of the actual processing tools. Getting prickly about meaningless terminology differences is definitely your right, but for anyone who isn't trying to define a policy algorithm for a transformer network, the difference is immaterial to understanding the computation involved.

    • Equating LLMs and transformers is not a meaningless terminology difference at all. Aristotle is so different from the things people call LLMs in terms of training data, loss function, and training procedure that this is a grievous error.