Comment by batshit_beaver

4 hours ago

Examples:

https://arxiv.org/html/2506.02878v1

https://arxiv.org/pdf/2508.01191

Anthropic themselves: https://www.anthropic.com/research/reasoning-models-dont-say...

They were approaching this from an interpretability standpoint, but the more interesting finding is that models arrive at an answer that fits their training and the provided context; the CoT is then generated to fit that anticipated answer.

In these studies, there are examples of CoT that directly contradicts the response the model ultimately settles on.

This is not reasoning. This is pretense.

This is just a no-true-Scotsman defense of reasoning. We were talking about inferring intent.

If someone recorded the inner monologue behind human decision-making, would it look like a logician's workbook? No, I don't think it would. People like to pretend they are rational.