Comment by bunderbunder
2 months ago
It’s because that’s what most resembles the bulk of the tasks it was being optimized for during pre-training.