Comment by xlayn
1 month ago
I updated the results. This round only covers Devstral, but I ran the full suite for it and posted all the results files, along with a script to re-run the process.
The results are more spectacular...
The model scored much better on gsm8k, but lost a bit in the other categories.