Comment by accountnum, 5 months ago

Again, you're stretching definitions into meaninglessness. The way you are using "sampling" and "distribution" here applies to any system processing any information -- humans included.

I can trivially define all of the nerve impulses entering and exiting your brain as a "distribution" in your usage of the term. Then all possible actions and experiences are just "sampling" that "distribution" as well. But that definition is meaningless.
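
To make the triviality concrete, here is a minimal sketch (the toy step function and its names are made up for illustration): record any deterministic system's transitions, relabel the record as an "empirical distribution", and every subsequent step becomes "sampling". The relabeling is always available, which is why it adds nothing.

```python
import random

# Hypothetical illustration: take any deterministic process, record its
# state transitions, and you can always relabel the result as an
# "empirical distribution" and call each step "sampling".

def deterministic_step(state: int) -> int:
    # Stand-in for any causal mechanism (nerve impulses, a thermostat, ...).
    return (3 * state + 1) % 17

# Record a trajectory of the deterministic system.
trajectory = [0]
for _ in range(50):
    trajectory.append(deterministic_step(trajectory[-1]))

# "Distribution": the empirical set of observed (state, next_state) pairs.
transitions = {}
for s, s_next in zip(trajectory, trajectory[1:]):
    transitions.setdefault(s, []).append(s_next)

def sample_next(state: int) -> int:
    # "Sampling" from the empirical distribution of transitions.
    return random.choice(transitions[state])

print(sample_next(0))  # always 1 here, yet it is still technically a "sample"
```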

No, causation isn't distribution sampling. And there's a difference between, say, an extrinsic description of a system and its essential properties.

E.g., you can describe a coin flip as sampling from the space {H, T} -- but insofar as we're talking about an actual coin, there's a causal mechanism, and this description fails (e.g., one can design a coin flipper that deterministically lands heads).
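
A minimal sketch of the coin point, with both classes invented for illustration: the same flip interface fits a fair coin and a rigged flipper, but the distributional description over {H, T} only captures the first.

```python
import random

# The distributional description "samples uniformly from {H, T}" fits the
# fair coin, but says nothing about the rigged flipper, whose behaviour is
# fixed by its causal mechanism rather than by the outcome space.

class FairCoin:
    def flip(self) -> str:
        return random.choice(["H", "T"])  # genuinely matches the {H, T} description

class RiggedFlipper:
    """A flipper engineered to land heads every time."""
    def flip(self) -> str:
        return "H"  # deterministic: the uniform sampling description fails here

for coin in (FairCoin(), RiggedFlipper()):
    print(type(coin).__name__, [coin.flip() for _ in range(5)])
```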

A transformer model, like all generative statistical models, is actually learning a distribution. The model is essentially constituted by a fit to a prior distribution, and when computing an output it is sampling from this fitted distribution.

I.e., the relevant state of the graphics card that computes an output token is fully described by an equation which is a sample from an empirical distribution (of prior text tokens).
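
A toy sketch of what that output step amounts to (the vocabulary and logit values below are made up; in a real transformer the logits come from the learned parameters): softmax turns the fitted scores into a categorical distribution, and the emitted token is a draw from it.

```python
import math
import random

# Hypothetical logits standing in for what a trained transformer's
# parameters would produce for the next token. The output is literally a
# draw from the categorical distribution defined by softmax(logits).

vocab = ["the", "cat", "sat", "mat"]
logits = [2.1, 0.3, -1.0, 0.5]          # stand-in values, not from a real model

# softmax turns the fitted scores into a probability distribution
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]

# producing a token = sampling from that distribution
next_token = random.choices(vocab, weights=probs, k=1)[0]
print(dict(zip(vocab, (round(p, 3) for p in probs))), "->", next_token)
```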

Your nervous system is a causal mechanism which is not fully described by sampling from such an outcome space. There is nowhere in your body that stores all possible bodily states as an outcome space: storing that space would require more atoms than exist in the universe.
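
A back-of-envelope illustration of that storage claim, using assumed figures for scale only (roughly 8.6 x 10^10 neurons, each crudely reduced to a single on/off bit, and about 10^80 atoms in the observable universe):

```python
import math

# Assumed round numbers, for scale only.
neurons = 8.6e10
atoms_in_universe = 1e80

# Number of distinct binary states is 2**neurons; compare via log10.
log10_states = neurons * math.log10(2)      # ~2.6e10
print(f"log10(number of states) ~ {log10_states:.2e}")
print(f"log10(atoms in universe) ~ {math.log10(atoms_in_universe):.0f}")
# Even the exponent of the state count (~26 billion digits) dwarfs 80,
# so enumerating the outcome space explicitly is physically impossible.
```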

So this isn't the case for causal mechanisms in general. Reality itself comprises essential properties which interact with each other in ways that cannot be reduced to sampling. Statistical models are therefore never models of reality in its essence, but only circumstantial approximations.

I'm not stretching definitions into meaninglessness; these are the ones given by AI researchers, of which I am one.

  • I'm going to simply address what I think are your main points here.

    There is nowhere that an LLM stores all possible outputs (see the sketch at the end of this reply). Causality can trivially be represented as sampling by including the ordering of events -- which you also implicitly did for LLMs. The coin is an arbitrary distinction: you are never modeling just a coin, any more than an LLM is modeling just a word. You are also modeling an environment, and that model would capture whatever you used to influence the coin toss.

    You are fundamentally misunderstanding probability and randomness, and then using that misunderstanding to arbitrarily imply simplicity in the system you want to diminish, while failing to apply the same reasoning to any other system.

    If you are indeed an AI researcher, which I highly doubt unless you provide actual credentials, then you would know that you are being imprecise and using that imprecision to sneak in unfounded assumptions.
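
    A rough sketch of the scale point flagged above, with assumed round numbers (a 50,000-token vocabulary, a 1,000-token output, and ~7 x 10^9 parameters -- none of these figures come from the thread):

    ```python
    import math

    # Assumed round numbers, for scale only. The model stores parameters
    # that *define* a conditional distribution; it never stores or
    # enumerates the output space itself.
    vocab_size = 50_000
    output_length = 1_000
    parameters = 7e9

    log10_possible_outputs = output_length * math.log10(vocab_size)
    print(f"log10(possible outputs) ~ {log10_possible_outputs:.0f}")   # ~4699
    print(f"log10(parameters stored) ~ {math.log10(parameters):.1f}")  # ~9.8
    # So, like the nervous system, the LLM cannot and does not store its
    # outcome space; each output is computed from a compact mechanism.
    ```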