Comment by MoonGhost
3 months ago
I think the problem is with the positional encoding. If the model cannot clearly separate tokens in the context window, they overlap, which leads to a mess. It's the encoding that matters, not the actual position.
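For concreteness, here is a minimal sketch (not from the comment itself) of the standard sinusoidal positional encoding from the original Transformer paper. It illustrates the kind of overlap the commenter describes: nearby positions receive nearly identical encodings, and even distant positions stay partially correlated through the low-frequency dimensions.

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    """Standard Transformer sinusoidal positional encoding:
    PE(pos, 2i) = sin(pos / 10000^(2i/d)), PE(pos, 2i+1) = cos(...)."""
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]     # (1, d_model/2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)             # even dims
    pe[:, 1::2] = np.cos(angles)             # odd dims
    return pe

pe = sinusoidal_pe(4096, 64)
# Cosine similarity between position 0 and every other position:
sims = pe @ pe[0] / (np.linalg.norm(pe, axis=1) * np.linalg.norm(pe[0]))
print(sims[:8])    # adjacent positions: encodings are almost identical
print(sims[-8:])   # distant positions: still partially correlated
```

If two positions' encodings are this similar, attention has little signal to tell their tokens apart, which is one way the "overlap" in the comment could play out.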