Comment by luma, 6 hours ago

For one, they aren't using the latest version of many of the benchmarks, e.g., ARC-AGI 2 and not 3.