Comment by Fargren

10 days ago

An LLM is not a universal Turing machine, though. It's a specific family of algorithms.

You can't build an LLM that will factorize arbitrarily large numbers, even in infinite time. But a Turing machine can.

To make a universal Turing machine out of an LLM only requires a loop and a model that looks up the next operation in a 2-state by 3-symbol transition table based on the context, and writes the resulting operations back to the context (the smallest known universal Turing machine has 2 states and 3 symbols, or the inverse).

So, yes, you can.

Once you have a (2,3) Turing machine, you can build from it a model of any larger Turing machine - it's just a question of allowing it enough computation and enough layers.
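To make the "lookup table plus loop" claim concrete, here is a minimal sketch (hypothetical code). It uses the classic 2-state, 2-symbol busy beaver rather than a (2,3) machine, simply because its rule table and behavior are easy to verify by hand - the structure of the argument is identical: the whole machine is one table lookup inside a loop.

```python
# Sketch: a Turing machine is nothing but a transition table driven by a loop.
# Rules are the 2-state, 2-symbol busy beaver (chosen over the (2,3) machine
# because its behavior is easy to check by hand).
from collections import defaultdict

RULES = {
    # (state, symbol): (write, move, next_state)
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "HALT"),
}

def run(rules, state="A"):
    tape = defaultdict(int)   # unbounded tape, blank symbol = 0
    head, steps = 0, 0
    while state != "HALT":
        write, move, state = rules[(state, tape[head])]  # one table lookup
        tape[head] = write
        head += move
        steps += 1
    return tape, steps

tape, steps = run(RULES)
print(steps, sum(tape.values()))  # halts after 6 steps with four 1s on the tape
```

Everything "Turing machine" about this lives in the loop and the table; nothing in the construction cares whether the lookup is done by a `dict` or by a model that reproduces the same mapping.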

It is not guaranteed that any specific architecture can do it efficiently, but that is entirely beside the point.

  • LLMs cannot loop (unless you have a counterexample?), and I'm not even sure they can do a lookup in a table with 100% reliability. They also have finite context, while a Turing machine can have infinite state.

    • If your argument is that a system incorporating a model is not an LLM if there is a loop around it, then reasoning models are not LLMs.

      They can do lookup in a table with 100% reliability, yes, because you can make them 100% deterministic if you wish by using numerically stable inferencing code and setting temperature to 0.

      Finite context is irrelevant, because the context can be used as an IO channel.

      A Turing machine does not have infinite state within the mechanism itself - it requires access to a potentially infinite tape. A Turing machine can be constructed with as little as 1 bit of internal state (the (2,3) and (3,2) Turing machines are the smallest possible, where one number is the number of states and the other is the number of distinct symbols it can handle).

      An IO channel is computationally equivalent to an infinite tape, and unlike an infinite tape, an IO channel is physically possible.
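The temperature-0 point is just greedy decoding. A sketch (hypothetical logits, not a real inference stack): sampling collapses to a deterministic argmax when temperature is 0, so given bit-identical logits - which is what numerically stable inference code buys you - the same input always yields the same token.

```python
import math
import random

def pick_token(logits, temperature):
    """Select a token index. Temperature 0 = greedy argmax, fully deterministic."""
    if temperature == 0:
        return max(range(len(logits)), key=logits.__getitem__)
    # Otherwise: softmax with temperature, then sample (non-deterministic).
    weights = [math.exp(l / temperature) for l in logits]
    return random.choices(range(len(logits)), weights=weights)[0]

logits = [1.2, 3.7, 0.4]  # hypothetical model output for some context
# Every temperature-0 call returns token 1; sampled calls at T>0 need not.
print(pick_token(logits, 0))
```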

  • Are you saying that LLMs are Turing complete or did I misunderstand it?

    • An LLM in itself is inert - it's just the model - so when we talk about an LLM doing anything, it is doing so as part of an inference engine. An inference system with a loop is trivially Turing complete if you use the context as an IO channel, use numerically stable inferencing code, and set temperature to 0. In that case, all you need is for the model to encode a 6-entry lookup table to operate the "tape" via the context.
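That construction can be sketched directly (all names hypothetical). The "model" is stubbed as a stateless 6-entry lookup over (state, symbol) - which is all the argument requires of it - and the driver loop feeds it one query at a time through an append-only context. The rule table is an arbitrary toy 2-state, 3-symbol machine, not Wolfram's (2,3) machine; the point is that the loop and the context, not the model, hold the unbounded state.

```python
from collections import defaultdict

# Stub "model": stateless, deterministic (temperature 0), and nothing more than
# a 6-entry lookup table for a toy 2-state, 3-symbol machine (NOT Wolfram's).
TABLE = {
    # (state, symbol): (next_state, write, move)
    ("A", 0): ("B", 1, +1),
    ("A", 1): ("A", 2, +1),
    ("A", 2): ("H", 2, 0),   # "H" = halt, a convention of this sketch
    ("B", 0): ("A", 2, -1),
    ("B", 1): ("B", 1, -1),
    ("B", 2): ("A", 0, +1),
}

def model(context):
    """Reads only the latest query in the context; holds no state of its own."""
    state, symbol = context[-1]
    return TABLE[(state, symbol)]

def drive(max_steps=1000):
    tape = defaultdict(int)   # the "tape", operated via the IO channel
    head, state = 0, "A"
    context = []              # append-only context used as the IO channel
    for _ in range(max_steps):
        context.append((state, tape[head]))   # write the query into context
        state, write, move = model(context)   # one inference call
        tape[head] = write
        head += move
        if state == "H":
            break
    return state, dict(tape)

print(drive())  # this toy machine halts after 4 steps
```

Swap the `dict` lookup for any inference engine that reproduces the same 6-entry mapping deterministically and the argument goes through unchanged.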