Comment by bermudi
17 hours ago
Source? On the most trusted benchmark right now (deepSWE), models score at least as well with the benchmark's minimal harness as they do with CC or codex.