Comment by alex43578

1 month ago

Isn’t that the point of the hidden chain-of-thought tokens, rather than the visible cruft?

I think the fluff, the emojis, and the sycophancy are all symptoms of the training process and human feedback.

I thought PP was saying that the "Thinking" text is only used for one turn, and the response text is the compressed thinking that survives into future turns.
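
Roughly what I understand that to mean, as a toy sketch. This is not any real vendor API: the `Turn` class and `build_context` function are names made up for illustration, and the assumption is simply that thinking text is dropped when the history for the next turn is assembled.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One message in a conversation (hypothetical structure)."""
    role: str
    thinking: str  # hidden reasoning, assumed to be used only within this turn
    response: str  # visible text


def build_context(history):
    """Build the messages carried into the next turn.

    Only the visible response is kept; the thinking text is discarded,
    so the response is the only part of the reasoning that persists.
    """
    return [{"role": t.role, "content": t.response} for t in history]


history = [
    Turn("user", "", "What is 12 * 12?"),
    Turn("assistant", "10 * 12 = 120, plus 2 * 12 = 24, so 144.", "144"),
]

context = build_context(history)
for message in context:
    print(message)
```

Under that assumption, anything the model worked out in its thinking but did not put in the response is gone by the next turn.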