Comment by jmalicki
2 months ago
Its output quite literally is not independent, as the "thinking tokens" are attended to by the attention mechanism.
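To make that concrete, here is a minimal sketch of single-head causal self-attention in NumPy. It is a toy illustration, not any real model's code: the random weights, the dimensions, and the names `causal_attention`, `thinking_a`, and `thinking_b` are all assumptions made up for this example. It keeps the prompt and the final token fixed, swaps only the intermediate "thinking" tokens, and checks whether the output at the final position changes.

```python
import numpy as np

def causal_attention(x, wq, wk, wv):
    """Toy single-head causal self-attention over x of shape (T, d)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Mask future positions: each token attends only to itself and earlier tokens.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Row-wise softmax (numerically stabilised).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
d = 8  # hypothetical embedding size
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))

prompt = rng.normal(size=(3, d))      # fixed prompt tokens
thinking_a = rng.normal(size=(4, d))  # one set of "thinking" tokens
thinking_b = rng.normal(size=(4, d))  # a different set
answer_tok = rng.normal(size=(1, d))  # identical final token in both runs

out_a = causal_attention(np.vstack([prompt, thinking_a, answer_tok]), wq, wk, wv)
out_b = causal_attention(np.vstack([prompt, thinking_b, answer_tok]), wq, wk, wv)

# The final position attends over the thinking tokens, so its output differs.
answer_depends_on_thinking = not np.allclose(out_a[-1], out_b[-1])
# The prompt positions come before the thinking tokens, so they are unaffected.
prompt_unaffected = np.allclose(out_a[:3], out_b[:3])
print(answer_depends_on_thinking, prompt_unaffected)
```

The causal mask is the reason the dependence runs one way only: later positions see the thinking tokens, earlier ones do not.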