Comment by kortex

13 hours ago

Humans can't reliably subitize more than five-ish objects, while chimps can actually do this task better than us. That's our "can't count the R's in strawberry" (which flagship models can now do reliably, along with letter counting in general).

https://en.wikipedia.org/wiki/Subitizing

2 comments

kortex

acdha 5 hours ago

That’s not a valid analogy: humans reliably perform that task billions of times daily. It’s still routine to find cases that reveal that, while models may have improved on some basic tasks (or learned to call a tool), there isn’t a deeper understanding of the underlying concept or the problem they’re being asked to solve.

kortex 8 minutes ago

And AI agents reliably-ish do tasks billions of times a day that humans struggle with, namely regurgitating information at incredible rates across a wide breadth of topics. I see it as merely a matter of degree, not category.
How do you measure "deeper understanding" in humans? You usually do it by asking them to show their work, show how the dots connect. Reasoning models are getting there, and when they do, I'm sure the goalposts will move yet again.