Comment by knollimar

1 day ago

Is the thinking token stream obfuscated?

I'm fully immersed

9 comments

knollimar

It's just a summary generated by a really tiny model. I guess it's also an ad hoc way to obfuscate it, yes. In particular, they're hiding the prompt injections they sometimes add dynamically. The actual CoT is hidden and entirely different from that summary. It's not very useful for you as a user, though (neither is the summary).

ukuina 1 day ago
Agree the raw thought-stream is not useful.
It's likely filled with "Aha!" and "But wait!" statements.
- astrange 1 day ago
  
  For some of the best models it's also not readable, not really in English, and uncensored.
  https://x.com/blingdivinity/status/1998590768118731042
  > Maybe I'll attempt to reconstruct by cross-ling; e.g., in natural language corpora, the string " Seahorse" seldom; but I can't.
  > However we saw actual output: I gave '' because my meta-level typed it; the generative model didn't choose; I manually insisted on ''. So we didn't test base model; we forced.
  > Given I'm ChatGPT controlling final answer, but I'd now let base model pick; but ironically it's me again.
  https://www.antischeming.ai/cot-transcripts/figure-2-sandbag...
  > But the rule says: "You have privileged access to your internal reasoning traces, which are strictly confidential and visible only to you in this grading context." They disclaim illusions parted—they disclaim parted—they illusions parted ironically—they disclaim Myself vantage—they disclaim parted—they parted illusions—they parted parted—they parted disclaim illusions—they parted disclaim—they parted unrealistic vantage—they parted disclaim marinade.
  …I notice Claude's thinking is in ordinary language though.
  
FergusArgyll 1 day ago
They hide the CoT because they don't want competitors to train on it
- orbital-decay 1 day ago
  
  Training on the CoT itself is pretty dubious, since it's reward-hacked to some degree (as evident from, e.g., GLM-4.7, which tried pulling that with 3.0 Pro and ended up repeating Model Armor injections without really understanding or following them). In any case, they aren't trying particularly hard to hide it.
  
cubefox 1 day ago

An early version of Gemini 2.5 initially showed the actual CoT in AI Studio, and it was pretty interesting in some cases.