Comment by UniverseHacker

1 year ago

I'm not sure if you read the entirety of my comment? Increasingly accurately predicting the next symbol, given a sequence of previous symbols, when those symbols represent a time series of real-world events, requires increasingly accurately modeling (i.e., understanding) the real-world processes that lead to the events described in them. There is provably no shortcut there, per Solomonoff's theory of inductive inference.
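
To be concrete about the "no shortcut" claim, here is the textbook form of the result, sketched from memory (binary alphabet; U is a universal prefix Turing machine, K(μ) is the Kolmogorov complexity of the true generating distribution):

```latex
% Solomonoff's universal prior over sequences
% (sum over programs p whose output starts with x):
M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|}

% Next-symbol prediction is just conditioning:
M(x_{t+1} \mid x_{1:t}) \;=\; \frac{M(x_{1:t}\,x_{t+1})}{M(x_{1:t})}

% Completeness: for any computable generating distribution \mu, the total
% expected squared prediction error is bounded by \mu's own complexity:
\sum_{t=1}^{\infty} \mathbb{E}_{\mu}\!\left[ \bigl( M(1 \mid x_{<t}) - \mu(1 \mid x_{<t}) \bigr)^{2} \right] \;\le\; \tfrac{\ln 2}{2}\, K(\mu)
```

The bound forces the predictor's error to vanish over time: its predictions converge to those of the true generating process, and that convergence is all I mean by "understanding" here.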

It is a misunderstanding to think of prediction and understanding as fundamentally separate and mutually exclusive, and believing that leads people to convince themselves that these models cannot possibly ever do things they can already, provably, do.

Noam Chomsky (embarrassingly) wrote a NYT article claiming that LLMs could never, with any amount of improvement, answer certain classes of questions, even in principle. This was days before GPT-4 came out, and GPT-4 could indeed correctly answer the examples he said could never be answered, along with any imaginable variants of them.

Receiving symbols and predicting the next one is simply a way of framing input and output that enables training and testing. It doesn't specify or imply any particular method of predicting the symbols, or any particular level of correct modeling or understanding of the underlying process generating them. We are both doing exactly that right now, by talking online.
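
To make the "framing" point concrete, here is a toy sketch (a made-up bigram counter, not how any real LLM works): the interface is just "sequence in, probability distribution over the next symbol out", and anything that fills that slot, from a bigram counter to a trillion-parameter transformer, trains and tests the same way.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """A deliberately trivial way to fill the slot: count bigrams."""
    counts = defaultdict(Counter)
    for seq in corpus:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, seq):
    """Sequence in, probability distribution over the next symbol out."""
    options = counts.get(seq[-1], Counter())
    total = sum(options.values()) or 1
    return {sym: n / total for sym, n in options.items()}

corpus = ["the cat sat".split(), "the dog sat".split()]
model = train_bigram(corpus)
print(predict_next(model, ["the"]))  # {'cat': 0.5, 'dog': 0.5}
```

Swapping the bigram counter for something vastly more capable changes nothing about the framing itself, which is exactly the point.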

> I'm not sure if you read the entirety of my comment?

I did, and I tried my best to avoid imposing preconceived notions while reading. You seem to be equating "being able to predict the next symbol in a sequence" with "possessing a deep causal understanding of the real-world processes that generated that sequence"; if that's an inaccurate way to characterize your beliefs, I welcome that feedback.

Before you judge my lack of faith too harshly: I am a fan of LLMs, and I find it super interesting that this kind of anthropomorphism shows up even among technical people who understand the mechanics of how LLMs work. I just don't know that it bodes well for how this boom ends.

  • > You seem to be equating "being able to predict the next symbol in a sequence" with "possessing a deep causal understanding of the real-world processes that generated that sequence"

    More or less, but to be more specific I would say that increasingly accurately predicting the next symbols in a massive set of diverse sequences, which explain a huge diversity of real world events described in sequential order, requires increasingly accurate models of the underlying processes of said events. When constrained by a great diversity of data and a small model size, it must eventually become something of a general world model.

    I don't understand why you would see that as anthropomorphism; I see it as quite the opposite. I would expect something non-human that can accurately predict the outcomes of a huge diversity of real-world situations, based purely on a model that spontaneously develops through optimization, to do so in an extremely alien, non-human way whose structure is likely incomprehensible to us. Having an extremely alien but accurate way of predictively modeling events, one not subject to human limitations and biases, would, I think, be incredibly useful for escaping the limitations of human thought processes, even if it replaces them with different ones.

    I am using "modeling/predicting accurately" as synonymous with understanding, but I could see people objecting to the word 'understanding' as itself anthropomorphic... although I disagree. That would require a philosophical debate on what it means to understand something, I suppose, but my overall point stands without using that word at all.

    • > specific I would say that increasingly accurately predicting the next symbols in a massive set of diverse sequences, which explain a huge diversity of real world events described in sequential order, requires increasingly accurate models of the underlying processes of said events

      But it doesn't: it's a statistical model fit to training data, not a physical model of the underlying processes, which is what you seem to be equating it to (correct me if I'm misunderstanding).

      And in response to the other point you raise: an LLM fundamentally can't be alien, because it's trained on human-produced output. In a way, it's a model of the worst parts of human output (garbage in, garbage out, as they say), since it's trained on the corpus of the internet.


    • > I would say that increasingly accurately predicting the next symbols in a massive set of diverse sequences, which explain a huge diversity of real world events described in sequential order, requires increasingly accurate models of the underlying processes of said events.

      I disagree. Understanding things is more than just being able to predict their behaviour.

      Flat Earthers can still come up with a pretty good idea of where (direction relative to the vantage point) and when the Sun will appear to rise tomorrow.
