Comment by layer8
3 months ago
What’s missing in that analogy is that humans tend to have a good hunch about when they have to think more and when they are “done”. LLMs seem to be missing a mechanism for that kind of awareness.
3 months ago
> What’s missing in that analogy is that humans tend to have a good hunch about when they have to think more and when they are “done”. LLMs seem to be missing a mechanism for that kind of awareness.
LLMs actually do have such a hunch; they just don't use it. You can literally ask them, "Would you do better if you started over?" and start over if the answer is yes. This works.
https://arxiv.org/abs/2410.02725
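A minimal sketch of that restart loop, assuming an OpenAI-style chat client; the model name, the self-check prompt, and the YES/NO protocol are illustrative, not taken from the paper:

    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # illustrative model choice
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def solve_with_restarts(task: str, max_restarts: int = 3) -> str:
        answer = ask(task)
        for _ in range(max_restarts):
            # Self-check, as the comment describes: ask the model
            # whether it would do better starting over.
            verdict = ask(
                f"Task: {task}\nYour answer: {answer}\n"
                "Would you do better if you started over? Reply YES or NO."
            )
            if not verdict.strip().upper().startswith("YES"):
                break  # the model judges itself "done"
            answer = ask(task)  # start over from scratch
        return answer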
Great observation. Maybe an additional “routing model” could be trained to predict when it’s better to keep thinking vs. just using the current result.
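The comment imagines a trained router; as a stand-in, a prompted router built on the ask() helper from the sketch above can approximate the same decision. The prompt and the THINK/DONE protocol here are made up for illustration:

    def route(task: str, draft: str) -> bool:
        # Stand-in for the proposed trained router: a cheap prompted
        # judgment on whether more thinking would pay off.
        verdict = ask(
            f"Task: {task}\nDraft answer: {draft}\n"
            "Would spending more reasoning likely improve this answer? "
            "Reply THINK or DONE."
        )
        return verdict.strip().upper().startswith("THINK")

    def answer_with_routing(task: str) -> str:
        draft = ask(task)  # cheap first pass
        if not route(task, draft):
            return draft  # router says the current result is good enough
        # Router says think more: a slower, more deliberate pass.
        return ask(f"Think through this step by step before answering:\n{task}")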