Comment by hackernewds
4 hours ago
One would think a model scoring this high on SWE-bench could easily maximize F1 score on a precision-recall problem. What's the missing piece?