Comment by snemvalts

6 days ago

The ability to learn and infer without absorbing millions of books and all the text on the internet really does make us special. And only at 20 watts!

Last I checked, humans didn't pop into existence doing that. It happened after billions of years of brute-force, trial-and-error evolution. So well done for falling into the exact same trap the OP cautions against. Intelligence from scratch requires a mind-boggling amount of resources, and humans were no different.

  • To be fair, it is still pretty remarkable what the human brain does, especially in the early years: there is no text embedded in the brain, just a crazily efficient mechanism for learning hierarchical systems. As far as I know, AI cannot do anything similar to this; it generally relies on giga-scaling, or on finetuning for tasks similar to those it already knows. Regardless of how this arose, or whether it's relevant to AGI, this is still unique in a way.

    • Human babies "train" their brain on literally gigabytes of multi-modal data dumped on them through all their sensory organs every second.

      In a very real sense, our magic superpower is that we "giga-scale" with such low resource consumption, especially considering how large (in terms of parameters) the brain is compared to even the most advanced models we have running on those thousands of GPUs today. But that's where all those millions of years of evolution pay off. Don't diss the wetware!

  • And then an 18-to-20-something-year training run is required for each individual instance.

    • I know, right? Such a waste. Plus it's so random how they will turn out!

      Any suggestions on how to reduce that waste?

  • Do you think evolutionary pressures are the best explanation for why humans were able to posit the Poincaré conjecture and solve it? While our mental architecture evolved over a very long time, we still learn from minuscule amounts of data compared to LLMs.

  • How is that relevant? Take the human brain as it is at the point of birth (or some time before that). We compare that with an LLM doing inference. The training part is irrelevant, the same way the human brain's evolution is.

We have a tremendous amount of raw information flowing through our brains 24/7 from before we are born, from the external world through all our senses and from within our minds as they attempt to make sense of that information, make predictions, generally reason about our existence, hallucinate alternative realities, etc. etc.

If you were somehow able to capture, in full detail, all the information you've had access to by the age of, say, 25, it would likely dwarf the amount of information in millions of books by several orders of magnitude.

When you are 25 years old and are presented with a strange-looking ball and told to throw it into a strange-looking basket for the first time, you are relying on an unfathomable amount of information turned into knowledge, and on countless prior experiments that you've accumulated and exercised up to that point relating to the way your body and the world work.

  • Humans are "multi-modal". Sure, we get plenty of non-textual information, but LLMs were trained on basically every human-written word ever. They definitely see many orders of magnitude more language than any human has ever seen. And yet humans get fluent after 3+ years.

    • If you treat the human brain as a model, and account for the full complexity of neurons (one neuron != one parameter!) it has several orders of magnitude more parameters than any LLM we've made to date, so it shouldn't come as a surprise.

      What is surprising is that our brain, as complex as it is, can train so fast on such a meager energy budget.

    • For sure, it seems like there's something there primed to pick up human language quickly, clearly evolutionarily driven.

      Not necessarily so for the dynamics of magnetic fields, or nonhuman animal communications, or dark energy/matter.

      We are bombarded nonstop by magnetic fields and nonhuman animal communications, and live in a universe which seems to be dominated by dark energy and matter, and yet we understand little to none of it all.

20 watts ignores the startup cost: Tens of millions of calories. Hundreds of thousands of gallons of water. Substantial resources from at least one other human for several years.
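For a rough sense of scale, here is a back-of-envelope sketch of that startup cost. Only the 20 W figure comes from the thread; the 20-year window and the 2,000 kcal/day whole-body intake are assumptions picked for illustration, so treat the output as an order-of-magnitude guess, not a measurement.

```python
# Back-of-envelope energy budget. Every input except BRAIN_WATTS is an assumption.
BRAIN_WATTS = 20               # figure quoted in the thread
YEARS = 20                     # assumed length of the "startup" period
KCAL_PER_DAY = 2000            # assumed average whole-body food intake
HOURS_PER_YEAR = 24 * 365.25
KCAL_PER_KWH = 3.6e6 / 4184    # 1 kWh = 3.6 MJ, 1 kcal = 4184 J

# Energy used by the brain alone, running at a constant 20 W
brain_kwh = BRAIN_WATTS * HOURS_PER_YEAR * YEARS / 1000
brain_kcal = brain_kwh * KCAL_PER_KWH

# Food energy for the whole body over the same period
body_kcal = KCAL_PER_DAY * 365.25 * YEARS

print(f"brain alone: {brain_kwh:,.0f} kWh (~{brain_kcal / 1e6:.1f} million kcal)")
print(f"whole body:  ~{body_kcal / 1e6:.1f} million kcal")
```

Under these assumptions the brain alone comes to roughly 3,500 kWh (about 3 million kcal), and the whole body to about 14.6 million kcal, which is in the same ballpark as the "tens of millions of calories" above.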

Just an interesting thought experiment: if you took all the sensory information that a child experiences through their senses (sight, hearing, smell, touch, taste) between, say, birth and age five, how many books' worth of data would that be? I asked Claude, and its estimate was about 200 million books. Maybe that number is off by an order of magnitude either way... but then again, Claude is only three years old, not five.
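One way to sanity-check that estimate is to turn it around and ask what average data rate it implies. A minimal sketch, assuming roughly 1 MB of plain text per book (that size is my assumption, not part of the estimate):

```python
# Implied average sensory data rate behind the "200 million books" estimate.
BOOKS = 200e6              # estimate quoted above
BYTES_PER_BOOK = 1e6       # assumption: ~1 MB of plain text per book
YEARS = 5                  # birth to age five
SECONDS = YEARS * 365.25 * 24 * 3600

total_bytes = BOOKS * BYTES_PER_BOOK
implied_rate = total_bytes / SECONDS   # bytes per second, averaged over all 5 years

print(f"total: {total_bytes / 1e12:.0f} TB")
print(f"implied average rate: {implied_rate / 1e6:.2f} MB/s")
```

With that book size, the estimate works out to about 200 TB in total, or a bit over 1 MB/s averaged around the clock. Change `BYTES_PER_BOOK` and the rate scales with it.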

To be fair, the knowledge embedded in an LLM is also, at this point, a couple of orders of magnitude (at least) larger than what the average human being can retain. So it's not like all those books and text on the internet are used just to bring them to our level; they go way beyond.

Now multiply that by 7 billion to distill the one who will solve a frontier math problem.

Most people have absorbed way too few books to be able to infer properly. Hell, most people are confused by TV remotes.