Comment by ben_w

7 days ago

You're not alone in this; I expect us to have not yet enumerated all the things that we ourselves mean by "intelligence".

But conversely, not passing this test is proof of not being as general as human intelligence.

I find the "what is intelligence?" discussion a little pointless if I'm honest. It's similar to asking what it means to be a "good person", and whether we would ever know if an AI or person is really "good".

While understanding why a person or AI is doing what it's doing can be important (perhaps specifically in safety contexts) at the end of the day all that's really going to matter to most people is the outcomes.

So if an AI can use what appears to be intelligence to solve general problems and can act in ways that are broadly good for society, whether or not it meets some philosophical definition of "intelligent" or "good" doesn't matter much – at least in most contexts.

That said, my own opinion on this is that the truth is likely in between. LLMs today seem extremely good at being glorified auto-completes, and I suspect most (95%+) of what they do is just recalling patterns in their weights. But unlike traditional auto-completes they do seem to have some ability to reason and solve truly novel problems. As it stands I'd argue that ability is fairly poor, but this might only represent 1-2% of what we use intelligence for.

If I were to guess why this is, I suspect it's not that LLM architecture today is completely wrong, but that the way LLMs are trained means that, in general, knowledge recall is rewarded more than reasoning. This is similar to the trade-off we humans have with education – do you prioritise the acquisition of knowledge or critical thinking? Many believe critical thinking is more important and should be prioritised more, but I suspect that for the vast majority of tasks we're interested in solving, knowledge storage and recall is actually more important.

  • That's certainly a valid way of looking at their abilities at any given task — "The question of whether a computer can think is no more interesting than the question of whether a submarine can swim".

    But when the question is "are they going to be more important to the economy than humans?", then they have to be good at basically everything a human can do; otherwise we just see a variant of Amdahl's law in action (rough sketch at the end of this comment): the AI delivers an arbitrarily large speed-up on n % of the economy while humans are needed for the remaining 100-n %.

    I may be wrong, but it seems to me that the ARC prize is more about the latter.
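
    A rough back-of-the-envelope sketch of that Amdahl's-law point (a minimal illustration; the fractions below are purely made-up, not claims about any real economy):

        # Amdahl's law applied to the economy: if AI handles a fraction p of all
        # work, the human-bottlenecked remainder caps the overall speed-up,
        # no matter how fast the AI part becomes.

        def overall_speedup(p: float, ai_speedup: float = float("inf")) -> float:
            """p: fraction of work the AI does; ai_speedup: its speed-up factor."""
            return 1.0 / ((1.0 - p) + p / ai_speedup)

        for p in (0.5, 0.9, 0.99):
            print(f"AI covers {p:.0%} of work -> overall speed-up {overall_speedup(p):.0f}x")
        # 50% -> 2x, 90% -> 10x, 99% -> 100x: the remaining human share dominates.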

    • > "are they going to be more important to the economy than humans?", then they have to be good at basically everything a human can do

      I really don’t think that’s the case. A robot that can stack shelves faster than a human is more valuable at that job than someone who can move items and also appreciate comedy. One that can write software more reliably than person X is more valuable than X at that job, even if X is well-rounded and can do cryptic crosswords and play the guitar.

      Also, for many tasks they can be worse but cheaper.

      I do wonder how many tasks something like o3 or o3 pro can’t do as well as a median employee.
