Comment by D-Machine
18 days ago
The ARC-AGI 2 private test set is one current bar that a large number of people find important, and it will convince a lot of people again if LLMs start doing really well on it. The performance drop on the private set is still huge, though, and scores there remain far below human performance.