Comment by ummonk

7 days ago

"Being able to solve ARC problems is probably a pre-requisite to AGI." - is it? Humans have general intelligence and most can't solve the harder ARC problems.

https://arcprize.org/leaderboard

"Avg. Mturker" has 77% on ARC1 and costs $3/task. "Stem Grad" has 98% on ARC1 and costs $10/task. I would love a segment like "typical US office employee" or something else in between since I don't think you need a stem degree to do better than 77%.

It's also worth noting that the "Human Panel" gets 100% on ARC2 at $17/task. All the "Human" baselines sit on the score/cost frontier and are exceptional in their score range, though obviously too expensive to win the prize.
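By "on the score/cost frontier" I mean not Pareto-dominated: no other entry scores at least as high at no greater cost. Here's a minimal Python sketch of that check, using the ARC1 human figures above; the model_* rows are hypothetical placeholders, not real leaderboard entries:

    def pareto_frontier(entries):
        """entries: list of (name, cost_usd_per_task, score_pct).

        An entry is dominated if another entry costs no more, scores
        at least as high, and is strictly better on at least one axis.
        """
        frontier = []
        for name, cost, score in entries:
            dominated = any(
                c <= cost and s >= score and (c < cost or s > score)
                for n, c, s in entries
                if n != name
            )
            if not dominated:
                frontier.append(name)
        return frontier

    entries = [
        ("Avg. Mturker", 3.00, 77.0),  # ARC1 figures cited above
        ("Stem Grad", 10.00, 98.0),    # ARC1 figures cited above
        ("model_a", 5.00, 60.0),       # hypothetical placeholder
        ("model_b", 1.00, 40.0),       # hypothetical placeholder
    ]
    print(pareto_frontier(entries))
    # -> ['Avg. Mturker', 'Stem Grad', 'model_b']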

I think the real argument is that the ARC problems are too abstract and obscure to be relevant to useful AGI, but we need a little flexibility there so we can have tests that are objectively and mechanically gradable. E.g. "write a NYT bestseller" is an impractical test in many ways, even if it's closer to what AGI should be.

  • > I think the real argument is that the ARC problems are too abstract and obscure to be relevant to useful AGI

    I think it's meant to work the way getting things off the top shelf at the supermarket isn't relevant to playing basketball: the task itself is arbitrary, but the same underlying capability drives both.

    They, and the other posters making similar claims, don't mean human-like intelligence, or even the rigorously defined solving of unconstrained problem spaces that originally defined Artificial General Intelligence (in contrast to "narrow" intelligence).

    They mean an artificial god, and it has become a god of the gaps: we have made artificial general intelligence, it turned out more human-like than god-like, and so to make a god we must have it do XYZ precisely because XYZ is something people can't do.