Comment by gozzoo
1 month ago
Intelligence is so vaguely defined and has so many dimensions that it is practically impossible to assess. The only approximation we have is the benchmarks we currently use. It is no surprise that model creators optimize their models for the best results in these benchmarks. Benchmarks have helped us drastically improve models, taking them from a mere gimmick to "write my PhD thesis." Currently, there is no other way to determine which model is better or to identify areas that need improvement.
That is to say, focusing on scores is a good thing. If we want our models to improve further, we simply need better benchmarks.
According to this very model, only "mere technicalities" differentiate human and AI systems ...
Current AI lacks:
- First-person perspective simulation
- Continuous self-monitoring (metacognition error <15%)
- Episodic future thinking (>72h horizon)

Episodic Binding (Memory integration): Depends on:
- Theta-gamma cross-frequency coupling (40Hz phase synchronization)
- Dentate gyrus pattern separation (1:7000 distinct memory encoding)
- Posterior cingulate cortex (reinstatement of distributed patterns)

AI's failure manifests in:
- Inability to distinguish similar-but-distinct events (conceptual blending rate ~83%)
- Failure to update prior memories (persistent memory bias >69%)
- No genuine recollection (only pattern completion)

Non-Essential (Emotional Valence)
While emotions influence human storytelling:
- 65% of narrative interpretations vary culturally
- Affective priming effects decay exponentially (<7s half-life)
- Neutral descriptions achieve 89% comprehension accuracy in controlled studies

The core computational challenge remains bridging:
- Symbolic representation (words/syntax)
- Embodied experience (sensorimotor grounding)
- Self-monitoring (meta-narrative control)

Current LLMs simulate 74% of surface narrative features but lack the substrate for genuine meaning-making. It's like generating symphonies using only sheet music - technically accurate, but devoid of the composer's lived experience.
Could you share a reference for those wanting to learn more?
Unfortunately I can't. I closed the chat a while ago. It was a kinda long conversation, in which I first convinced the model to abandon its role. As a side effect, the "thinking" switched to Chinese and I could no longer understand what it "thinks". The excerpt I posted above was the last answer in that conversation. I would not trust any number in this response, so there is no point in giving a reference.