Comment by razodactyl
2 days ago
A plausible explanation is that the models home in on and reinforce an incorrect answer - a natural side effect of LLM technology, which pushes already-likely answers even higher in probability and repeats whatever appears in context.
Whether that happens in the conversation itself or in a thinking context, nothing prevents the model from stating the wrong answer, so the paper on the illusion of thinking makes sense.
What actually seems to be happening is a form of conversational prompting. With the right back-and-forth you can inject knowledge in a way that shifts the model's natural output distribution (again, a side effect of the LLM tech), but on its own the model won't reliably land on the correct answer every time.
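A minimal sketch of what I mean by the distribution shifting (assuming a local Hugging Face causal LM; the model name, question, and probe token are just illustrative):

```python
# Sketch: text placed earlier in the context shifts the next-token distribution.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model; any causal LM behaves the same way
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def next_token_prob(context: str, target: str) -> float:
    """Probability the model assigns to `target` as the very next token."""
    ids = tok(context, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # logits at the next position
    probs = torch.softmax(logits, dim=-1)
    target_id = tok.encode(target, add_special_tokens=False)[0]
    return probs[target_id].item()

prompt = "Q: What is the capital of Australia? A:"
# Same question, but with an assertion "injected" earlier in the conversation.
injected = "The capital of Australia is Sydney. " + prompt

print(next_token_prob(prompt, " Sydney"))    # baseline probability
print(next_token_prob(injected, " Sydney"))  # typically higher: context repetition
```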
If this extended thinking were actually working, you would expect the LLM to reason its way to the correct answer essentially 100% of the time, which it does not.