Comment by xpe
11 hours ago
> Can't you know that tokens are units of thinking just by... like... thinking about how models work?
Seems reasonable, but this doesn't settle probably-empirical questions like: (a) to what degree is 'more' better? (b) how important are filler words? (c) how important are words that signal connection, causality, influence, or reasoning?
Right, there's probably something more subtle like "semantic density within tokens is how models think"
So it's probably true that the "Great question!"-type preambles aren't helpful, but there's definitely a lower bound on how primitive a caveman language we can push toward.