
Comment by Marha01

6 hours ago

> a lot of people seem to see LLMs as smarter than themselves

Well, in many cases they might be right.

As far as I can tell from poking people on HN about what "AGI" means, there might be a general belief that the median human is not intelligent. Given that the current batch of models apparently isn't AGI, I'm struggling to see a clean test of what AGI might be that a human could pass.

  • LLMs may appear to do well on certain programming tasks on which they have been trained intensively, but beyond those they are incredibly weak. If you try to use an LLM to generate, for example, a story, you will find that it makes unimaginable mistakes. If you ask an LLM to analyze a conversation from the internet, it will misrepresent the positions of the participants, often restating things so that they mean something different, or mixing up who said what in a way that humans never do. The longer the exchange, the more these problems are exacerbated.

    We are incredibly far from AGI.

    • > We are incredibly far from AGI.

      This, and we don't actually know what the foundation models for AGI are; we're just assuming LLMs are it.

    • We do have AI systems that write stories [0]. They work. The quality might not be spectacular, but if you've ever spent time reading fanfiction you'd have to agree there are a lot of rather terrible human writers too (bless them). It still runs into this issue: if we want LLMs to compete with the best of humanity, then they aren't there yet, but that means defining human intelligence as something most people don't have access to.

      > If you ask an LLM to analyze a conversation from the internet it will misrepresent the positions of the participants, often restating things so that they mean something different or making mistakes about who said what in a way that humans never do.

      AI transcription & summarization seems to be a strong point of these models, so I don't know what exactly you're trying to get at with this one. If you have evidence for it I'd actually be quite interested, because humans are so bad at representing what other people said on the internet that it seems like it should be an easy win for an AI. Humans typically produce wild interpretations of what other people write that cannot be supported by what was written. (Rough test sketch below.)

      [0] https://github.com/google-deepmind/dramatron
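
      For what it's worth, here's roughly how I'd test the who-said-what claim. A minimal sketch in Python, where `ask_llm` is a placeholder for whatever chat-completion call you have access to, and the transcript and questions are invented for illustration:

          # Minimal harness for who-said-what attribution on a transcript.
          # `ask_llm` is a placeholder: any function that sends a prompt to an
          # LLM and returns its text reply. Transcript and questions are invented.
          from typing import Callable

          TRANSCRIPT = (
              "alice: I think the median human would pass any reasonable AGI test.\n"
              "bob: I disagree; most proposed tests are either trivial or impossible.\n"
              "alice: Then the tests are badly designed, not the humans.\n"
          )

          QUESTIONS = [
              # (question, expected speaker)
              ("Who said the proposed tests are either trivial or impossible?", "bob"),
              ("Who said the tests are badly designed?", "alice"),
          ]

          def attribution_score(ask_llm: Callable[[str], str]) -> float:
              """Fraction of attribution questions answered with the right speaker."""
              correct = 0
              for question, expected in QUESTIONS:
                  prompt = (
                      f"Transcript:\n{TRANSCRIPT}\n"
                      f"Question: {question}\n"
                      "Answer with just the speaker's name."
                  )
                  correct += expected in ask_llm(prompt).strip().lower()
              return correct / len(QUESTIONS)

      Run it over real threads with known ground truth and the misattribution rate becomes a number rather than a vibe.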


    • This seems distant from my experience. Modern LLMs are superb at summarisation, far better than most people.

  • > there might be a general belief that the median human is not intelligent

    This is to deconstruct the question.

    I don't think it's even wrong - a lot of people are doing things, making decisions, living life perfectly normally, successfully even, without applying intelligence in a personal way. Those with socially accredited 'intelligence' would be the worst offenders imo - they do not apply their intelligence personally but simply massage themselves and others towards consensus. Which is ultimately materially beneficial to them - so why not?

    For me 'intelligence' would be knowing why you are doing what you are doing without dismissing the question with reference to 'convention', 'consensus', someone/something else. Computers can only do an imitation of this sort of answer. People stand a chance of answering it.

  • Being an intelligent being is not the same as being considered intelligent relative to the rest of your species. I think we're just looking to create an intelligence, meaning something with the attributes that make a being intelligent, which are mostly the ability to reason and learn. I think the being might take over from there, no?

    With humans, the speed and ease with which we learn and reason is capped. I think a very dumb intelligence will not stay dumb for very long, because every resource will be spent on making it smarter.

    • > every resource will be spent in making it smarter

      The root motivation driving every resource spent is, simply and very obviously, to make a profit.

> ChatGPT (o3): Scored 136 on the Mensa Norway test in April 2025

So yes, most people are right in that assumption, at least by the metric of how we generally measure intelligence.
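
Taking that score at face value, the claim is just arithmetic. A quick check in Python, assuming the conventional mean-100, SD-15 IQ scale (an assumption about the test's norming, not stated above):

    # Percentile rank of an IQ score of 136, assuming the conventional
    # mean-100, SD-15 norming (an assumption, not confirmed by the source).
    from statistics import NormalDist

    percentile = NormalDist(mu=100, sigma=15).cdf(136)
    print(f"{percentile:.1%}")  # ~99.2%: roughly 99 in 100 people score lower

On that scale, 136 sits around the 99th percentile, so "smarter than most people" follows by a wide margin.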

  • Does an LLM scoring well on the Mensa test translate to it doing excellent and factual police reporting? That's probably not true of humans who do well on the Mensa test, so why would it be true of an LLM?

    We should probably verify that rigorously, for a role that is itself about rigorous verification beyond reasonable doubt.

    I can immediately, and reasonably, doubt the output of an LLM, pending verification.

  • Yeah, I certainly associate LLMs with high intelligence when they provide fake links to fake information. I think, man, this thing is SMART.

  • Court reports should be as much about human sensibility as about raw intelligence. I have met plenty of high-IQ people who were insensitive.

    • Having listened to some of the new AI-generated songs on YouTube, it looks like they might be better at being sensitive humans than we are as well.
