Comment by jddj
15 hours ago
For the most part I think we get the benchmarks we deserve.
Many SWE-bench-passing PRs would not be merged: https://news.ycombinator.com/item?id=45214670