Comment by Bjorkbat
5 months ago
Yeah, I saw the pass@7 figure as well, and I'm not sure what to make of it. On the one hand, solving nearly half of all tasks is impressive. On the other hand, a machine that might do something correctly if you give it 7 attempts isn't particularly enjoyable to use.
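To put a rough number on that second point, here is a back-of-the-envelope sketch. It assumes every attempt is independent and has the same success probability on every task, which is a strong simplification, since real benchmarks mix easy and hard tasks. The 0.5 is only a stand-in for "nearly half", not the reported figure, and the function names are made up for illustration.

```python
def per_attempt_rate(pass_at_k: float, k: int) -> float:
    """Single-attempt success rate implied by a pass@k score.

    Assumes independent attempts with one shared success probability p,
    so pass@k = 1 - (1 - p)**k. Solving for p gives the line below.
    """
    return 1.0 - (1.0 - pass_at_k) ** (1.0 / k)


def pass_at_k_from_rate(p: float, k: int) -> float:
    """Inverse of the above: chance that at least one of k attempts succeeds."""
    return 1.0 - (1.0 - p) ** k


# Hypothetical input: 0.5 stands in for "nearly half of all tasks".
implied = per_attempt_rate(0.5, 7)
print(f"implied single-attempt success rate: {implied:.3f}")
```

Under those assumptions a pass@7 of about one half works out to a bit under a one in ten chance per attempt, which is closer to what using it once would feel like.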