Comment by orbital-decay

1 day ago

Training on the CoT itself is pretty dubious, since it's reward-hacked to some degree (as evident from e.g. GLM-4.7, which tried pulling that with 3.0 Pro and ended up repeating Model Armor injections without really understanding or following them). In any case they aren't trying to hide it particularly hard.

> In any case they aren't trying to hide it particularly hard.

What does that mean? Are you able to read the raw CoT? How?

  • My guess is they mean Google creates those summaries via tool use and isn't trying to filter the actual chain of thought at the API level, or return errors if the model starts leaking it (see the sketch at the end of this comment for how the documented path looks).

    If you work with big contexts in AI Studio (like 600,000-900,000 tokens), it sometimes just breaks down on its own and starts returning raw CoT without any prompt hacking whatsoever.

    I believe that if you intentionally tried to expose it, it would be pretty easy to achieve.
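
    For contrast, here's a minimal sketch of the documented path (Python, google-genai SDK; the API key, model id, and prompt are placeholders, not anything from the thread): thought summaries come back as response parts flagged thought=True next to the answer text, and there's no documented field for the raw CoT at all, which is why getting it back means something broke.

        # Minimal sketch using the google-genai SDK (pip install google-genai).
        # API key, model id, and prompt below are placeholders.
        from google import genai
        from google.genai import types

        client = genai.Client(api_key="YOUR_API_KEY")  # placeholder

        response = client.models.generate_content(
            model="gemini-2.5-pro",  # placeholder model id
            contents="Why is the sky blue?",
            config=types.GenerateContentConfig(
                thinking_config=types.ThinkingConfig(include_thoughts=True),
            ),
        )

        # Thought *summaries* arrive as parts flagged thought=True;
        # there is no field carrying the raw chain of thought itself.
        for part in response.candidates[0].content.parts:
            if not part.text:
                continue
            if part.thought:
                print("[thought summary]", part.text)
            else:
                print("[answer]", part.text)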