Comment by leumon

7 days ago

He said in the video that they tested regular people (Uber drivers, etc.) on ARC-AGI-2, and every task was solved by at least 2 people (on average, 9-10 people saw each task). Also, this quote from the paper: "None of the self-reported demographic factors recorded for all participants—including occupation, industry, technical experience, programming proficiency, mathematical background, puzzle-solving aptitude, and various other measured attributes—demonstrated clear, statistically significant relationships with performance outcomes. This finding suggests that ARC-AGI-2 tasks assess general problem-solving capabilities rather than domain-specific knowledge or specialized skills acquired through particular professional or educational experiences."