Comment by MoonGhost
14 days ago
I think the problem is with positional encoding. If the model cannot clearly separate tokens in the context window, they overlap, which leads to a mess. It's the encoding that matters, not the actual position.
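To make the "separating tokens" point concrete, here is a minimal sketch of the standard sinusoidal positional encoding from the Transformer paper; the function name, dimensions, and the similarity check are illustrative choices, not anything from the comment itself. The idea is that each position gets a distinct vector, and if distant positions end up with vectors that are too similar, their tokens effectively overlap from the model's point of view.

```python
# Sketch of sinusoidal positional encoding (Vaswani et al., 2017).
# Dimensions and positions below are arbitrary, chosen for illustration.
import numpy as np

def sinusoidal_positions(num_positions: int, dim: int) -> np.ndarray:
    """Return a (num_positions, dim) matrix of positional encodings."""
    positions = np.arange(num_positions)[:, None]                   # (P, 1)
    freqs = np.exp(-np.log(10000.0) * np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = positions * freqs                                      # (P, dim/2)
    enc = np.zeros((num_positions, dim))
    enc[:, 0::2] = np.sin(angles)   # even dimensions: sine
    enc[:, 1::2] = np.cos(angles)   # odd dimensions: cosine
    return enc

# Cosine similarity between nearby vs. distant positions: if distant
# positions stay too similar, their tokens "overlap" in the model's view.
pe = sinusoidal_positions(4096, 128)

def cos_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(f"similarity(100, 101)  = {cos_sim(pe[100], pe[101]):.3f}")
print(f"similarity(100, 4000) = {cos_sim(pe[100], pe[4000]):.3f}")
```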