Comment by BloondAndDoom
17 hours ago
Why do you think that? Given how attention and optimization work during training and inference, it makes sense that these kinds of words trigger deeper analysis (more steps, introducing more thinking/reasoning steps, which indeed yield fewer problems). Even just making the model spend more time emitting tokens gives it more opportunity for better reasoning to emerge in between.
At least that's how I understand LLMs to work.
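The core idea can be sketched with a toy decoding loop (purely illustrative, not a real model): autoregressive generation runs one forward pass per emitted token, so every extra "thinking" token buys the model another full pass over the growing context.

```python
# Toy sketch (hypothetical stand-in model, not a real LLM): each generated
# token costs one forward pass, so longer "reasoning" outputs get more
# sequential compute before the final answer.

def toy_forward(context):
    # Stand-in for a transformer forward pass; returns a dummy next token.
    return f"t{len(context)}"

def generate(prompt_tokens, n_new_tokens):
    context = list(prompt_tokens)
    passes = 0
    for _ in range(n_new_tokens):
        next_token = toy_forward(context)  # one forward pass per new token
        context.append(next_token)
        passes += 1
    return context, passes

# A step-by-step answer emitting 50 intermediate tokens gets 10x the
# sequential compute of a 5-token direct answer.
_, direct_passes = generate(["Q"], 5)
_, cot_passes = generate(["Q"], 50)
```

Whether that extra compute is actually used for useful intermediate computation is the open question, but it's at least an opportunity the short answer doesn't have.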
This could possibly be confirmed with interpretability tools like this: https://www.neuronpedia.org/