Comment by numbers_guy

1 day ago

I'm confused about ARC-AGI. I thought the point was that you train a foundation model, then test it against ARC-AGI to figure out how well it reasons. Here, and in some of the other reasoning papers, they are training on ARC-AGI itself. How much sense does that make in practice?

ARC-AGI allows (and encourages) training on its training set. The evaluation setup is rigorous enough to avoid leakage between the training data and the test sets (public and private).