Comment by glaslong
14 hours ago
Principal-SWE-Bench will take some time to run, because the LLM has to wait for a crisis before presenting its solution, having correctly identified that the same solution would have been organizationally impossible to propose until that moment.