
Comment by accountnum

5 months ago

They're not sampling from prior conversations. The model constructs abstracted representations of domain-specific reasoning traces, then applies those abstractions in various combinations to solve unseen problems.

If you want to call that sampling, then you might as well call everything sampling.

They're generative models. By definition, they sample from a joint distribution over text tokens, one fit by approximation to an empirical distribution.
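
Concretely, a minimal sketch (toy vocabulary and hard-coded logits, both invented for illustration; a real model would compute the logits from the preceding tokens) of what sampling from the fitted distribution looks like at decode time:

    import numpy as np

    # Hypothetical toy setup: a trained model would produce these logits
    # conditioned on the preceding tokens; they are hard-coded here.
    vocab = ["the", "cat", "sat", "mat"]
    logits = np.array([2.0, 0.5, 1.0, -1.0])

    # Softmax turns the logits into a probability distribution over tokens.
    probs = np.exp(logits) / np.exp(logits).sum()

    # Emitting the next token is literally drawing a sample from it.
    token = np.random.choice(vocab, p=probs)
    print(token)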

  • Again, you're stretching definitions into meaninglessness. The way you are using "sampling" and "distribution" here applies to any system processing any information. Yes, humans as well.

    I can trivially define the set of all nerve impulses entering and exiting your brain as a "distribution" in your usage of the term. Then all possible actions and experiences are just "sampling" from that "distribution" as well. But that definition is meaningless.

    • No, causation isn't distribution sampling. And there's a difference between, say, an extrinsic description of a system and its essential properties.

      E.g., you can describe a coin flip as sampling from the outcome space {H, T} -- but insofar as we're talking about an actual coin, there's a causal mechanism, and the description fails (e.g., one can design a coin flipper that deterministically flips to heads).
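
      A toy contrast (both functions are invented for illustration): the statistical description and the causal mechanism share the same outcome space, yet the former fails to capture the latter:

          import random

          # Extrinsic, statistical description: a coin flip "is" a draw
          # from the outcome space {H, T}.
          def coin_as_distribution():
              return random.choice(["H", "T"])

          # Causal mechanism: a hypothetical precision flipper built to
          # always land heads. Same outcome space, but the Bernoulli(0.5)
          # description no longer describes what the device does.
          def rigged_flipper():
              return "H"

          print([coin_as_distribution() for _ in range(5)])  # varies run to run
          print([rigged_flipper() for _ in range(5)])        # always 'H'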

      In the case of a transformer model, as with all generative statistical models, what it learns is a distribution. The model is essentially constituted by a fit to a prior distribution, and when it computes an output it is sampling from this fitted distribution.

      I.e., the relevant state of the graphics card that computes an output token is fully described by an equation which samples from an empirical distribution (of prior text tokens).

      Your nervous system is a causal mechanism that is not fully described by sampling from such an outcome space. There is nowhere in your body that stores all possible bodily states as an outcome space: storing that space would require more atoms than there are in the universe.
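
      A rough check of that size claim (assuming, crudely, just two states per neuron and the usual ~10^80 order-of-magnitude estimate for atoms in the observable universe):

          import math

          NEURONS = 86_000_000_000   # rough human neuron count
          LOG10_ATOMS = 80           # order-of-magnitude estimate

          # With even two states per neuron the outcome space has
          # 2**NEURONS elements, so compare base-10 exponents instead.
          log10_states = NEURONS * math.log10(2)
          print(f"~10^{log10_states:.3g} states vs ~10^{LOG10_ATOMS} atoms")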

      The same holds for any causal mechanism. Reality itself comprises essential properties which interact in ways that cannot be reduced to sampling. Statistical models are therefore never essential models of reality, only circumstantial approximations of it.

      I'm not stretching definitions into meaninglessness; these are the definitions given by AI researchers, of whom I am one.
