Comment by dgfl
20 hours ago
Isn’t this what AGI is by design? People CAN learn to become good at videogames. Modern LLMs can’t, they have to be retrained from scratch (I consider pre-training to be a completely different process than learning). I also don’t necessarily agree that a grandma would fail. Give her enough motivation and a couple days and she’ll manage these.
My main criticism would be that it doesn’t seem like this test allows online learning, which is what humans do (over the scale of days to years). So in practice it may still collapse to what you point out, but not because the task is unsuited to showing AGI.
What I'm saying is that this test is just another "out-of-distribution task" for an LLM. And it will be solved using the exact same methods we always use: it will end up in the pre-training data, and LLMs will crush it.
This has absolutely nothing to do with AGI. Once they beat these tests, new ones will pop up. They'll beat those, and people will invent the next batch.
The way I see it, the true formula for AGI is: [Brain] + [External Sensors] (World Receptors) + [Internal State Sensors] + [Survival Function] + [Memory].
I won't dive too deep into how each of these components has its own distinct traits and is deeply intertwined with the others (especially the survival function and memory). But on a fundamental level, my point is that we are not going to squeeze AGI out of LLMs just by throwing more tests and training cycles at them.
These current benchmarks aren't bringing us any closer to AGI. They merely prove that we've found a new layer of tasks that we simply haven't figured out how to train LLMs on yet.
P.S. A 2-year-old child is already an AGI in terms of functional makeup and internal interaction architecture, even though the child is far less equipped for survival than a kitten. The path to AGI isn't just endless task training; it's a shift toward a fundamentally different decision-making architecture.
> Once they beat these tests, new ones will pop up. They'll beat those, and people will invent the next batch.
That's exactly the point! Once we can no longer invent a next batch (one that is easy for humans to solve), that will be AGI.
> Isn’t this what AGI is by design?
Well, the "G" in AGI is kinda important. These are specifically games/puzzles.
> they have to be retrained from scratch
Is that true? Didn't DeepMind already build plenty of agents that are generally good at most computer games without being retrained?