
Comment by accountnum

10 months ago

I'm going to simply address what I think are your main points here.

There is nowhere that an LLM stores all possible outputs. Causality can trivially be represented in sampling by including the ordering of events, which you also implicitly did for LLMs. The coin is an arbitrary distinction: you are never just modeling a coin, just as an LLM is never just modeling a word. You are also modeling an environment, and that model would capture whatever you used to influence the coin toss.
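
As a rough, hypothetical sketch (the events and probabilities below are invented), "sampling that includes the ordering of events" can be as simple as drawing each event conditioned on the events before it:

```python
# Hypothetical sketch: each event is drawn conditioned on the event before it,
# so the ordering is part of what the sampler represents.
import random

transitions = {
    "START": [("clouds_gather", 0.7), ("clear_sky", 0.3)],
    "clouds_gather": [("rain", 0.8), ("clear_sky", 0.2)],
    "clear_sky": [("END", 1.0)],
    "rain": [("wet_ground", 0.9), ("END", 0.1)],
    "wet_ground": [("END", 1.0)],
}

def sample_ordered_events():
    """Draw one ordered sequence of events from the conditional model."""
    event, sequence = "START", []
    while event != "END":
        options, weights = zip(*transitions[event])
        event = random.choices(options, weights=weights)[0]
        if event != "END":
            sequence.append(event)
    return sequence

print(sample_ordered_events())  # e.g. ['clouds_gather', 'rain', 'wet_ground']
```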

You are fundamentally misunderstanding probability and randomness, and then using that misunderstanding to arbitrarily imply simplicity in the system you want to diminish, while failing to apply the same reasoning to any other.

If you are indeed an AI researcher, which I highly doubt without you providing actual credentials, then you would know that you are being imprecise and using that imprecision to sneak in unfounded assumptions.

LLMs are just modelling token order. The weights are a compression of the outcome space.

No, causality is not just an ordering.

  • [flagged]

    • It's not a matter of making points; it's at least a semester's worth of courses on causal analysis, animal intelligence, the scientific method, and explanation.

      Causality isn't ordering. Take two contrary causal mechanisms (e.g., filling a bathtub with a hose and emptying it with a bucket). The level of the bath is arbitrarily orderable with respect to either of these mechanisms.

      cf. https://en.wikipedia.org/wiki/Collider_(statistics)
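
      As a hedged sketch of this bathtub collider (variable names and numbers below are invented), the two mechanisms are independent, yet conditioning on the level makes them look dependent; no ordering of observations reveals this structure:

```python
# Toy simulation of the bathtub collider (hose -> level <- bucket),
# assuming numpy is available; names and numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
hose = rng.normal(size=n)      # filling rate (one causal mechanism)
bucket = rng.normal(size=n)    # emptying rate (the other, independent mechanism)
level = hose - bucket + 0.1 * rng.normal(size=n)  # the collider

# The two mechanisms are independent:
print(np.corrcoef(hose, bucket)[0, 1])              # ~ 0.0

# Condition on the collider (look only at near-constant levels):
held = np.abs(level) < 0.2
print(np.corrcoef(hose[held], bucket[held])[0, 1])  # far from 0: induced dependence
```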

      Go on YouTube and find people growing a nervous system in a lab, and you'll notice it's an extremely plastic, constantly physically adapting system. You'll note that the very biochemical "signalling" you're talking about is itself involved in changing the physical structure of the system.

      This physical structure does not encode all prior activations of the system, nor even a compression of them.

      To see this, consider Plato's cave. Outside the cave, a variety of objects passes by, casting shadows on the wall. The objects themselves are not compressions of these shadows. Inside the cave, you can make one of these yourself: take clay from the floor and fashion a pot. This pot, like the ones outside, is not a compression of its shadow.

      All statistical algorithms which average over historical cases are compressions of shadows, and they replay these shadows on command; i.e., they learn the distribution of shadows and sample from that distribution on demand.
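
      A deliberately crude sketch of that claim, with made-up "shadows": the algorithm compresses its history into a frequency table and replays samples from it on demand, and nowhere does it contain a pot.

```python
# A crude "shadow replayer": it compresses past shadows into a frequency
# table and samples from that distribution on demand. Labels are invented.
import random
from collections import Counter

observed_shadows = ["tall", "tall", "squat", "tall", "squat", "narrow"]

counts = Counter(observed_shadows)       # the "compression" of past shadows
shapes, weights = zip(*counts.items())

def replay_shadow():
    """Sample a shadow from the learned distribution, on demand."""
    return random.choices(shapes, weights=weights)[0]

print([replay_shadow() for _ in range(5)])
# It can only emit the kinds of shadows it has statistics for; there is no
# pot anywhere in this program from which a genuinely new shadow could be cast.
```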

      Animals, and indeed all of science, are not concerned with shadows. We don't model patterns in the night sky -- that is astrology -- we model gravity: we build pots.

      The physical structure of our bodies encodes that of reality itself. It does so by sensorimotor modulation of organic processes of physical adaptation. If you like: our bodies are like clay, and reality fashions that clay into the right structure.

      In any case, we haven't the time or space to convince you of this formally. Suffice it to say that it is a very widespread consensus that modelling conditional probabilities with generative models fails to model causality. You can read Judea Pearl on this if you want to understand more.
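
      As a hedged, toy illustration of that point (every variable and coefficient below is invented): with a confounder, the observational conditional P(y | x) and the interventional P(y | do(x)) disagree, so a model fit to conditional probabilities alone misses the causal effect.

```python
# Toy confounded system, assuming numpy is available: z confounds x and y,
# and x has no causal effect on y at all. Names and coefficients are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
z = rng.normal(size=n)                      # confounder
x = z + rng.normal(size=n)                  # z -> x
y = 2.0 * z + rng.normal(size=n)            # z -> y; note: no x term

# Observational: the slope of y on x looks like an effect (~1.0) purely
# because of the confounder.
print("observed slope:", round(np.polyfit(x, y, 1)[0], 2))

# Interventional: do(x) severs the z -> x arrow, so set x independently of z.
x_do = rng.normal(size=n)
y_do = 2.0 * z + rng.normal(size=n)         # y still has no x term
print("do(x) slope:  ", round(np.polyfit(x_do, y_do, 1)[0], 2))  # ~ 0.0
```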

      Perhaps more simply: a video game model of a pot can generate an infinite number of shadows under an infinite number of conditions. And no statistical algorithm with finite space and finite time requirements will ever model this video game. The video game model does not store a compression of past frames -- since it has a real physical model, it can create new frames from that model.
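
      A toy version of that video-game pot, with the geometry invented for illustration: a small model of the object itself can cast a shadow for any light direction on demand, without storing or replaying any past frames.

```python
# A tiny geometric "pot" that renders shadows from a model rather than
# replaying stored frames. The shape and projection are invented.
import numpy as np

# The "pot": a ring of points in 3D (a crude object model, not a frame store).
theta = np.linspace(0, 2 * np.pi, 100)
pot = np.stack([np.cos(theta), np.sin(theta), np.ones_like(theta)], axis=1)

def shadow(light_dir):
    """Project the pot onto the ground plane (z = 0) along light_dir."""
    d = np.asarray(light_dir, dtype=float)
    t = pot[:, 2] / d[2]                 # distance along d to reach z = 0
    return pot[:, :2] - np.outer(t, d[:2])

# New shadows for lighting conditions that were never seen or stored:
print(shadow([0.3, 0.1, 1.0]).shape)     # (100, 2)
print(shadow([-1.2, 0.7, 2.0])[:3])      # first few points of another shadow
```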
