Comment by leetrout
2 months ago
Related: check out chain of draft if you haven't. Similar performance with roughly 7% of the tokens of chain of thought.
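For anyone unfamiliar, here is a rough sketch of what a chain-of-draft style prompt looks like next to a plain chain-of-thought prompt. The wording, the model name, and the OpenAI client usage are illustrative assumptions on my part, not the exact prompts from the Chain of Draft paper:

```python
# Minimal sketch contrasting chain-of-thought and chain-of-draft style prompting.
# Prompt wording and model name are assumptions, not the paper's exact setup.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

COT_SYSTEM = (
    "Think step by step to answer the question. "
    "Put the final answer after '####'."
)

COD_SYSTEM = (
    "Think step by step, but keep each thinking step to a minimal draft "
    "of at most five words. Put the final answer after '####'."
)

def ask(system_prompt: str, question: str) -> str:
    """Send one question with the given reasoning-style system prompt."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

question = "A jug holds 4 liters. How many jugs fill a 20-liter tank?"
print(ask(COT_SYSTEM, question))  # verbose multi-sentence reasoning
print(ask(COD_SYSTEM, question))  # terse drafts, e.g. "20 / 4 = 5 #### 5"
```

The token savings come entirely from the instruction to keep each reasoning step to a few words rather than full sentences.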
2 months ago
That's a comparison to "CoT via prompting of chat models", not "CoT via training reasoning models with RLVR" (reinforcement learning with verifiable rewards), so it may not apply here.
This seems remarkably less safe?
Why would we want to purposely decrease interpretability?
Very strange.