
Comment by trhway

2 days ago

>A toddler can learn by trial and error mid-process.

As a result of the whole learning process, the toddler in particular learns how to self-correct, i.e. as a grown-up s/he knows, without much trial and error anymore, how to continue in a straight line if the previous step went sideways for whatever reason.

>An LLM using autoregressive inference can only compound errors.

That is a pretty powerful statement, completely dismissing the possibility that some self-correction may be emerging there.

Can you expand on that? I don't see where it could emerge from.

  • The LLM handles/steers a representation (a trajectory consisting of successive representations) in a very high-dimensional space. For example, it is quite possible that, as a result of learning, those trajectories are driven by minimizing the distance (or some other metric) from the representation of some fact(s).

    The metric may include, say, the weight/density of the attracting cluster of facts, somewhat like gravitation drives matter in the Universe. LLM learning can then be thought of as pre-distributing matter in its own very high-dimensional Universe according to the semantic "gravitational" field.

    The resulting, emergent metric and its associated geometry are currently mind-bogglingly incomprehensible, and even in much simpler, single-digit-dimensional spaces, systems of the kind described by LeCun can still be [quasi]stable and/or [quasi]periodic around, say, some attractor(s).
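
To make the attractor intuition concrete, here is a toy sketch. It is not a model of an LLM: the update rule, the `pull` parameter, and the example vectors are all made up for illustration. It only shows that the same step-by-step iteration can either shrink an initial error (when each step pulls toward an attractor) or compound it (when each step pushes away).

```python
def step(x, attractor, pull):
    """One update: move each coordinate a fraction `pull` of the way to the attractor.

    pull in (0, 1) contracts toward the attractor; pull < 0 pushes away from it.
    """
    return [xi + pull * (ai - xi) for xi, ai in zip(x, attractor)]


def distance(x, y):
    """Euclidean distance between two points given as lists."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) ** 0.5


def run(pull, steps=30):
    """Start off-target and return the distance to the attractor after `steps` updates."""
    attractor = [1.0, -2.0, 0.5]   # hypothetical "fact" representation
    x = [4.0, 2.0, 0.5]            # starting point, initially 5.0 away from the attractor
    for _ in range(steps):
        x = step(x, attractor, pull)
    return distance(x, attractor)


self_correcting = run(pull=0.3)    # error is multiplied by 0.7 each step
compounding = run(pull=-0.1)       # error is multiplied by 1.1 each step
print(f"with attraction:    {self_correcting:.6f}")
print(f"without attraction: {compounding:.6f}")
```

Whether anything like the first regime actually emerges inside a trained LLM is exactly the open question; the sketch only shows that "iterating step by step" does not by itself imply that errors must grow.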