Comment by dnw
4 hours ago
Looks like the best way to keep improving the models is to come up with really useful benchmarks and make them popular. ARC-AGI-2 is a big jump; I'd be curious to find out how that transfers to everyday tasks in various fields.