Comment by saberience

7 days ago

Arc-AGI (and Arc-AGI-2) is the most overhyped benchmark around though.

It's completely misnamed. It should be called "useless visual puzzle benchmark 2".

Firstly, it's a visual puzzle, which makes it way easier for humans than for models trained on text. Secondly, it's not really that obvious or easy for humans to solve either!

So the idea that an AI that can solve "Arc-AGI" or "Arc-AGI-2" is super smart or even "AGI" is frankly ridiculous. It's a puzzle that basically means nothing, other than that the models can now solve "Arc-AGI".

7 comments

CuriouslyC 7 days ago

The puzzles are calibrated for human solve rates, but otherwise I agree.

saberience 7 days ago
My two elderly parents cannot solve Arc-AGI puzzles, but they can navigate the physical world, their house, and their garden, make meals, clean the house, use the TV, etc.
I would say they do have "general intelligence", so whatever Arc-AGI is "solving", it's definitely not "AGI".
- hmmmmmmmmmmmmmm 7 days ago
  
  You are confusing fluid intelligence with crystallised intelligence.
  