Comment by Me1000
2 years ago
Not OP and have no insight, but the thing that made it click for me was hearing “this token attends to that token”. Basically, a new value is computed that represents how much one thing (in an LLM, it’s tokens) cares about another thing.
Saying “attends to” vs “attention” helped clarify (for me) the mechanics of what’s going on.
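To make “this token attends to that token” concrete, here is a minimal sketch in plain Python, assuming the usual scaled dot-product form of attention. The token list and the 2-d query/key vectors are made up for illustration, and `attention_weights` is a hypothetical helper name, not any library's API. Row i of the result is how much token i attends to each token j.

```python
import math

def attention_weights(queries, keys):
    """Return a matrix where row i says how much token i attends to each token j.

    Assumes scaled dot-product attention: softmax(q . k / sqrt(d)) per query.
    """
    d = len(keys[0])
    weights = []
    for q in queries:
        # Raw score: how well this token's query matches every token's key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        # Softmax turns the scores into positive weights that sum to 1.
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Made-up tokens and vectors, purely for illustration.
tokens = ["the", "cat", "sat"]
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]

weights = attention_weights(queries, keys)
for tok, row in zip(tokens, weights):
    print(tok, "attends to", {t: round(w, 2) for t, w in zip(tokens, row)})
```

In a real model the queries and keys come from learned projections of the token embeddings, and these weights are then used to mix the value vectors. The sketch stops at the “how much does this token care about that one” step.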