Comment by sigbottle
3 hours ago
I think that a lot of models have to sprinkle in a lot of "fluff" in their thinking to stay within the right distribution. They only have language as their only medium; the way we annotate context is via brackets and then training them to hopefully respect the brackets. I'd imagine that either top labs explicitly train, or through the RL process the models implicitly learn, to spam tokens to keep them 'within distribution' since everything's going through the same channel and there's no fine grained separation between things.
Philosophically, it's not like you're a detached observer who simply reasons over all possible hypotheses. Ever get stuck in a dead end and find it hard to dig yourself out? If you were a detached observer, it'd be pretty easy to just switch gears. But it's not (for humans).
Language really only exists at the input and output surfaces of the models. In the middle it's all numerical values. Which you might be quick in relating to just being a numeric cypher of the words, which while not totally false, it misses that it is also a numeric cypher of anything. You can train a transformer on anything that you can assign tokens to.