Comment by 8note
6 months ago
This sounds like a fun research area. Do LLMs have plans about future tokens?
How do we get 100 tokens of completion, and not just one output layer at a time?
Are there papers you've read that you can share that support the hypothesis, versus the alternative that the LLM doesn't have ideas about future tokens when it's predicting the next one?
This research has been done; it was a core pillar of the recent Anthropic paper on token planning and interpretability.
https://www.anthropic.com/research/tracing-thoughts-language...
See the section “Does Claude plan its rhymes?”
Lol... Try building systems off them and you will very quickly learn concretely that they "plan".
It may not be as evident now as it was with earlier models. The models will fabricate the preconditions needed to output the final answer they "wanted".
I ran into this when using quasi least-to-most-style structured output.
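For readers unfamiliar with the setup being described, here is a minimal sketch of what "least-to-most style structured output" can look like. The schema, field names, and `solve` helper are my own illustrative assumptions, not the commenter's actual code; the point is only the shape: the model is constrained to emit its decomposition and intermediate answers before the final-answer field, which is where the fabricated preconditions tend to show up.

```python
# Minimal sketch (an illustration, not the parent commenter's code) of
# least-to-most style structured output: the model must fill in a
# decomposition and intermediate answers before the final-answer field.
from pydantic import BaseModel


class LeastToMostTrace(BaseModel):
    subproblems: list[str]           # easier subquestions, solved in order
    intermediate_answers: list[str]  # one answer per subproblem
    final_answer: str                # should follow from the steps above


def solve(question: str) -> LeastToMostTrace:
    """Hypothetical wrapper around whatever structured-output /
    JSON-schema-constrained decoding API is in use; it parses the
    model's JSON into the schema above."""
    raise NotImplementedError


# The failure mode described above: even though final_answer is decoded
# last, the earlier fields often read as if they were written to justify
# an answer the model had already settled on -- e.g. intermediate answers
# that don't follow from the subproblems but line up neatly with
# final_answer.
```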