Comment by numbers_guy
1 day ago
I'm confused about ARC-AGI. I thought the point was that you train a foundation model and then test it against ARC-AGI to see how well it reasons. Here, and in some of the other reasoning papers, they are training on ARC-AGI itself. How much sense does that make in practice?
ARC-AGI allows (and encourages) training on its training set. Its evaluation setup is rigorous enough to avoid leakage between the training and test sets (public and private).