
Comment by verdverm

8 days ago

Here's a good thread spanning over a month, updated as each model comes out

https://bsky.app/profile/pekka.bsky.social/post/3meokmizvt22...

tl;dr - Pekka says ARC-AGI-2 is now toast as a benchmark

If you look at the problem space, it is easy to see why it's toast: maybe there's intelligence in there, but hardly general.

  • the best way I've seen this described is "spikey" intelligence: really good at some points, and those make the spikes

    humans are the same way, we all have a unique spike pattern, interests and talents

    AIs have effectively the same spikes across instances, to simplify. I could argue self-driving vs chatbots vs world models vs game playing might constitute enough variation. I would not say the same of Gemini vs Claude vs ... (instances), that's where I see "spikey clones"

    • You can get more spiky with AIs, whereas with the human brain we are more hardwired.

      So maybe we are forced to be more balanced and general, whereas AIs don't have to be.


  • > maybe there's intelligence in there, but hardly general.

    Of course. Just as our human intelligence isn't general.