
Comment by mathiaspoint

10 days ago

I really hate the thinking. I do my best to disable it, but I don't always remember. So often it just gets into a loop second-guessing itself until it hits the token limit. It's rare that it figures anything out while it's thinking, too, but maybe that's because I'm better at writing prompts.

I have the impression that the thinking helps even if the actual content of the thinking output is nonsense. It gives the model more cycles to think about the problem.

  • That would be strange. There's no hidden memory or data channel; the "thinking" output is all the model receives afterwards. If it's all nonsense, then nonsense is all it gets. I wouldn't be completely surprised if a context full of apparent nonsense still helped somehow (LLMs are weird), but it would be odd.

    • This isn't quite right. Even when an LLM generates meaningless tokens, its internal state continues to evolve. Each new token triggers a fresh forward pass through the network, with attention over the KV cache, allowing the model to refine its contextual representation. The specific tokens may be gibberish, but the underlying computation can still reflect ongoing "thinking" (the first sketch at the end of this thread illustrates the loop).

    • Attention operates entirely on hidden memory, in the sense that its keys and values usually aren't exposed to the end user. An attention head at one thinking token can attend to one thing, the same head at the next thinking token can attend to something entirely different, and a later layer can combine the two values, maybe at the second thinking token, maybe much later. So even nonsense filler can create space for intermediate computation to happen (the second sketch at the end of the thread hand-wires a tiny example of this).

    • Eh. The embeddings themselves could act like hidden layer activations and encode some useful information.
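
To make the decode loop concrete, here's a toy, single-head sketch in NumPy. It isn't any real model's code; the dimensions, weights, and the `attend` helper are invented for illustration. The point it shows: every emitted token, useful or filler, triggers another forward pass, and its key/value get appended to the cache that every later position attends over.

```python
# Toy illustration (not real model code): one attention head with a KV cache.
# Every generated token -- meaningful or filler -- triggers another forward
# pass and leaves a key/value entry that later steps can attend to.
import numpy as np

rng = np.random.default_rng(0)
d = 16                                          # toy hidden size
W_q, W_k, W_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

def attend(x, k_cache, v_cache):
    """One decode step: project the current hidden state, append to the
    cache, then attend over everything generated so far."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    k_cache.append(k)
    v_cache.append(v)
    K, V = np.stack(k_cache), np.stack(v_cache)
    scores = K @ q / np.sqrt(d)                 # attention over the whole cache
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V                          # updated state at this position

k_cache, v_cache = [], []
prompt = rng.normal(size=(4, d))                # stand-ins for prompt embeddings
filler = rng.normal(size=(8, d))                # stand-ins for "nonsense" thinking tokens

for x in prompt:
    h = attend(x, k_cache, v_cache)
for x in filler:
    h = attend(x, k_cache, v_cache)             # the state keeps evolving during filler

print("cache entries after filler:", len(k_cache))   # 12 -- filler positions stay attendable
print("final hidden state norm:", np.linalg.norm(h))
```

A real model also runs MLPs and many layers per step, but the shape of the loop is the same: more emitted tokens means more forward passes and a longer cache to attend over.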
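
And for the "filler creates space for intermediate computation" point, a deliberately hand-wired cartoon (the attention patterns are hard-coded, nothing is learned, and the positions and "facts" are invented for the example): layer 1 stashes a different prompt fact into each of two filler slots, and layer 2 at the answer position combines what those slots are holding.

```python
# Hand-wired cartoon: filler positions used as scratch space.
import numpy as np

fact_a = np.array([1.0, 0.0])
fact_b = np.array([0.0, 1.0])
# Residual stream: positions 0-1 hold prompt facts, 2-3 are filler, 4 is the answer slot.
stream = np.stack([fact_a, fact_b, np.zeros(2), np.zeros(2), np.zeros(2)])

def apply_attention(stream, pattern):
    """pattern[i, j] = how much position i attends to position j; the attended
    values are added back into the residual stream."""
    return stream + pattern @ stream

# Layer 1: the same "head" attends to different things at different positions,
# copying fact A into filler slot 2 and fact B into filler slot 3.
layer1 = np.zeros((5, 5))
layer1[2, 0] = 1.0
layer1[3, 1] = 1.0
stream = apply_attention(stream, layer1)

# Layer 2: the answer position attends only to the two filler slots and ends up
# holding both facts -- the intermediate retrievals happened "in" the filler.
layer2 = np.zeros((5, 5))
layer2[4, 2] = 1.0
layer2[4, 3] = 1.0
stream = apply_attention(stream, layer2)

print(stream[4])   # [1. 1.] -- both facts reached the final position
```

The tokens actually sampled at positions 2 and 3 could be complete gibberish; what matters in this cartoon is that the positions exist, so their hidden states can hold intermediate results for later layers to combine.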

It's almost like there's an incentive for them to burn as many tokens as possible while accomplishing nothing useful.

I hate thinking mode because I prefer a mostly right answer right now over having to wait for an answer that's probably better, but still not exactly right.