Comment by hackernewds
3 hours ago
One would think a model scoring this high on SWE-bench could easily maximize F1 score on a precision-recall problem. What's the missing piece?