Comment by phailhaus

4 months ago

> People need to let go of this strange and erroneous idea that humans somehow have this privileged access to the 'real world'.

This is irrelevant. The point is that you do have access to a world that LLMs don't have access to at all. They only get the text we produce after we interact with the world. They are working with "compressed data" at all times, and have absolutely no idea what we subconsciously internalized but decided not to write down, or why.

famouswaffles 4 months ago

All of the SOTA LLMs today are trained on more than text.

What matters is not whether LLMs have "complete" (nothing does) or human-like world access, but whether the compression in text is lossy in ways that fundamentally prevent useful world modeling or reconstruction. And empirically... it doesn't seem to be. Text contains an enormous amount of implicit structure about how the world works, precisely because the humans writing it did interact with the world and encoded those patterns.

And your subconscious is far leakier than you imagine. Your internal state will bleed into your writing one way or another, whether you're aware of it or not. Models can learn to reconstruct arithmetic algorithms given just the operation and the answer, with no instruction. So what sort of things have LLMs reconstructed after being trained on trillions of tokens of data?