Comment by smallerize
2 hours ago
I think it means most of the training data is short. And a lot of the long-context examples are conversations where the middle turns are less important.
2 hours ago
I think it means most of the training data is short. And a lot of the long-context examples are conversations where the middle turns are less important.
No comments yet
Contribute on Hacker News ↗