Comment by comp_raccoon
3 months ago
Olmo author here, but I can help! The first release of Qwen 3 left a lot of performance on the table because they had some challenges balancing thinking and non-thinking modes. The VL series has refreshed post-training, so those models are much better!