Comment by uejfiweun

16 days ago

There is a certain logic to it though. If the scaling approaches DO get us to AGI, that's basically going to change everything, forever. And if you assume this is the case, then "our side" has to get there before our geopolitical adversaries do. Because in the long run the expected "hit" from a hostile nation developing AGI and using it to bully "our side" probably really dwarfs the "hit" we take from not developing the infrastructure you mentioned.

Any serious LLM user will tell you that there's no way to get from LLMs to AGI.

These models are vast and, in many ways, clearly superhuman. But they can't venture outside their training data, not even if you hold their hand and guide them.

Try getting Suno to write a song in a new genre. Even if you tell it EXACTLY what you want, and provide it with clear examples, it won't be able to do it.

This is also why there have been zero-to-very-few new scientific discoveries made by LLMs.

  • Most humans aren't making new scientific discoveries either, are they? Does that mean they don't have AGI?

    Intelligence is mostly about pattern recognition. All those model weights represent patterns, compressed and encoded. If you can find a similar pattern in a new place, perhaps you can make a new discovery.

    One problem is the patterns are static. Sooner or later, someone is going to figure out a way to give LLMs "real" memory. I'm not talking about keeping a long term context, extending it with markdown files, RAG, etc. like we do today for an individual user, but updating the underlying model weights incrementally, basically resulting in a learning, collective memory.

    • Virtually all humans of average intelligence are capable of making scientific discoveries -- admittedly minor ones -- if they devote themselves to a field, work at its frontiers, and apply themselves. They are also capable of originality in other domains, in other ways.

      I am not at all sure that the same thing is even theoretically possible for LLMs.

      Not to be facetious, but you need to spend more time playing with Suno. It really drives home how limited these models are. With text, there's a vast conceptual space that's hard to probe; it's much easier when the same structure is ported to music. The number of things it can't do absolutely outweighs the number of things it can do. Within days, even mere hours, you'll become aware of its peculiar rigidity.

  • Can most people venture outside their training data?

    • In some ways no, because once you LEARN something, it's then in the training data. But humans can do it continuously and sometimes randomly, and also without being prompted.

    • Are you seriously comparing chips running AI models and human brains now???

      Last time I checked the chips are not rewiring themselves like the brain does, nor does even the software rewrite itself, or the model recalibrate itself - anything that could be called "learning", normal daily work for a human brain.

      Also, the models are not models of the world, but of our text communication only.

      Human brains start by building a model of the physical world, from age zero. Much later, on top of that foundation, more abstract ideas emerge, including language. Text, even later. And all of it on a deep layer of a physical world model.

      The LLM has none of that! It has zero depth behind the words it learned. It's like a human learning some strange symbols and the rules governing their appearance. The human will be able to reproduce valid chains of symbols following the learned rules, but they will never have any understanding of those symbols. In the human case, somebody would have to connect those symbols to their world model by telling them the "meaning" in a way they can already use. For the LLM that is not possible, since it doesn't have such a model to begin with.

      How anyone can even entertain the idea of "AGI" based on uncomprehending symbol manipulation, where every symbol has zero depth of a physical world model, only connections to other symbols, is beyond me TBH.

  • I mean yeah, but that's why there are far more research avenues these days than just pure LLMs, for instance world models. The thinking is that if LLMs can achieve near-human performance in the language domain then we must be very close to achieving human performance in the "general" domain - that's the main thesis of the current AI financial bubble (see articles like AI 2027). And if that is the case, you still want as much compute as possible, both to accelerate research and to achieve greater performance on other architectures that benefit from scaling.

    • How does scaling compute not go hand-in-hand with energy generation? To me, scaling one and not the other puts a different set of constraints on overall growth. And the energy industry works at a different pace than these hyperscalers scaling compute.

    • The other thing here is we know the human brain learns on far fewer samples than LLMs in their current form. If there is any kind of learning breakthrough, then the amount of compute used for learning could explode overnight.

Scaling alone won't get us to AGI. We are in the latter half of this AI summer, where the real research has slowed down or even stopped and the MBAs and moguls are doing stupid things.

For us to take the next step towards AGI, we need an AI winter to hit and the next AI summer to start, the first half of which will produce the advancement we actually need.