Didn't the same Francois Chollet claim that this was the Real Test of Intelligence? If they target it, perhaps they target... real intelligence?
He's always said ARC is a necessary but not sufficient condition for testing intelligence, afaik.
He said in an interview that it doesn't count if it's explicitly targeted, only if a model generalizes to it.
He also said that the "real test of intelligence" is whether we can no longer come up with new tests that a human can easily do but the AI can't, not whether the AI can pass any specific benchmark.
I don't know what he could mean by that, as the whole idea behind ARC-AGI is to "target the benchmark." Got any links that explain further?
The fact that ARC-AGI has public and semi-private in addition to private datasets might explain it: https://arcprize.org/arc-agi/2/#dataset-structure
He should have kept it closed.