Comment by zozbot234
3 hours ago
It's picking strange tasks that don't really play to GPT-Pro's strengths (that model is roughly comparable to Mythos and is intended for very hard reasoning and research-level problems), and then it completely ignores quite a few cases where GPT-Pro was actually more correct than DeepSeek. The auto-AI ranking is just not reliable for this stuff.