Comment by layer8
3 months ago
What’s missing in that analogy is that humans tend to have a good hunch about when they have to think more and when they are “done”. LLMs seem to be missing a mechanism for that kind of awareness.
3 months ago
> What’s missing in that analogy is that humans tend to have a good hunch about when they have to think more and when they are “done”. LLMs seem to be missing a mechanism for that kind of awareness.
LLMs actually do have such a hunch; they just don't use it. You can literally ask them, "Would you do better if you started over?" and start over if the answer is yes. This works.
https://arxiv.org/abs/2410.02725
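A minimal sketch of that restart loop, assuming an OpenAI-style chat client; the model name, the self-check prompt, and the YES/NO protocol are illustrative, not taken from the paper:

    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # illustrative model choice
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def solve_with_restarts(task: str, max_restarts: int = 3) -> str:
        answer = ask(task)
        for _ in range(max_restarts):
            # Self-check, as the comment describes: ask the model
            # whether it would do better starting over.
            verdict = ask(
                f"Task: {task}\nYour answer: {answer}\n"
                "Would you do better if you started over? Reply YES or NO."
            )
            if not verdict.strip().upper().startswith("YES"):
                break  # the model judges itself "done"
            answer = ask(task)  # start over from scratch
        return answer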
Great observation. Maybe an additional “routing model” could be trained to predict when it’s better to keep thinking vs. just using the current result.
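The comment imagines a trained router; as a stand-in, a prompted router built on the ask() helper from the sketch above can approximate the same decision. The prompt and the THINK/DONE protocol here are made up for illustration:

    def route(task: str, draft: str) -> bool:
        # Stand-in for the proposed trained router: a cheap prompted
        # judgment on whether more thinking would pay off.
        verdict = ask(
            f"Task: {task}\nDraft answer: {draft}\n"
            "Would spending more reasoning likely improve this answer? "
            "Reply THINK or DONE."
        )
        return verdict.strip().upper().startswith("THINK")

    def answer_with_routing(task: str) -> str:
        draft = ask(task)  # cheap first pass
        if not route(task, draft):
            return draft  # router says the current result is good enough
        # Router says think more: a slower, more deliberate pass.
        return ask(f"Think through this step by step before answering:\n{task}")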