Comment by nayroclade
7 hours ago
The models they tested are already way behind the current state of the art. It would be interesting to see if their results hold up when repeated with the latest frontier models.