Comment by armen52
2 months ago
I don't understand this assertion, but maybe I'm missing something?
Google included a SWE-bench score of 63.8% in their announcement for Gemini 2.5 Pro: https://blog.google/technology/google-deepmind/gemini-model-...