Comment by lm28469

15 hours ago

But wait they're just about to get AGI why would he leave???

LeCun always said that LLMs do not lead to AGI.

  • Can anyone explain to me the non-$$ logic for one working towards AGI, aside from misanthropy?

    The only other thing I can imagine is not very charitable: intellectual greed.

    It can't just be that, can it? I genuinely don't understand. I would love to be educated.

    • Well, AGI could accelerate scientific and medical discovery, saving lives and impacting billions of people positively.

      The potential downside is admittedly severe.

    • I'm a true believer in AGI being able to become a force for immense good if deployed carefully by responsible parties.

      Currently one of the key issues with a lot of fields is that they operate as independent / largely isolated silos. If you could build a true AGI capable of achieving top-level mastery across multiple disciplines it would likely be able to integrate all that knowledge and make a lot of significant discoveries that would improve people's lives. Just exploring existing problem spaces with the full intellectual toolkit that humanity has developed is probably enough to make significant progress.

      Our understanding of biology is still painfully primitive. To give a concrete example, I dream that someday it'll be possible to develop medical interventions that allow humans to regrow missing limbs and fix almost any health issue.

      Have you ever lived with depression or any other psychiatric problem? I think if we could create medical interventions and environments that are conducive to healing psychiatric problems, that would also be a massive quality-of-life improvement for huge numbers of people. Do you know how our current psychiatric interventions work? You try some drug, flip a coin to see if it does anything, and wait 4 weeks to get the result. Then you keep iterating and hope that eventually the doctor finds some magical combination to make life barely tolerable.

      I think the best path forward for improving humanity's understanding of biology, and ultimately medical science, is to go all-in on AGI-style technology.

    • Trying to engage in good faith here but I don't really get this. You're pretending to have never encountered positive visions of technologically advanced futures.

      Cure all disease?

      Stop aging?

      End material scarcity?

      It's completely fair to expect that these are all twisted monkey's paw scenarios that turn out dystopian, but being unable to understand any positive motivations for the creation of AGI seems a bit far-fetched.

    • R&D can be automated to speed up medical research - saving lives, prolonging life, etc.

      Assistant robots for the elderly. In many countries the population is shrinking, so fundamentally there just aren't enough people to take care of the old.

    • Have you ever seen that "science advocate vs scientist" comic?

      https://www.smbc-comics.com/?id=2088

      It's true. When it comes to the people doing bleeding edge research and development, the answer often is "BECAUSE IT'S FUCKING AWESOME". Regardless of what they tell the corporate higher-ups or put on the grant application statements.

      Sure, a lot of people believe that AGI is going to make the world a better place. But "mad scientist" is a stereotype for a reason. You look into their eyes and you see the flame of madness flickering behind them.

  • He also said other things about LLMs that turned out to be either wrong or easily bypassed with some glue. While I understand where he comes from, and that his stance is purely research- and theory-driven, at the end of the day his positions were wrong.

    Previously, he very publicly and strongly said:

    a) LLMs can't do math. They trick us in poetry but that's subjective. They can't do objective math.

    b) They can't plan.

    c) By the very nature of the autoregressive architecture, errors compound. So the longer you go in your generation, the higher the error rate, and at long contexts the answers become utter garbage.
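
    To sketch the compounding intuition behind "c" (a toy calculation with a made-up per-token success probability p, not a measured value): if each generated token is independently fine with probability p, the chance that an n-token answer contains no error is p^n, which decays exponentially with length.

        # Toy illustration of the compounding-error argument.
        # p is an assumed, purely illustrative per-token success probability.
        p = 0.999

        for n in (100, 1_000, 10_000, 100_000):
            print(f"{n:>7} tokens: P(no error) = p**n ~ {p ** n:.3g}")

    The contentious part is the independence assumption; better training regimes and self-correction are essentially about breaking it.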

    All of these were proven wrong 1-2 years later: "a" at the core (gold-medal performance at the IMO), "b" with software glue, and "c" with better training regimes.

    I'm not interested in the will-it-won't-it debates about AGI; I'm happy with what we have now, and I think these things are good enough now for several use cases. But it's important to note when people making strong claims get them wrong. Again, I think I get where he's coming from, but public stances aren't the place to get into the deep research minutiae.

    That being said, I hope he gets to find whatever it is that he's looking for, and wish him success in his endeavours. Between him, Fei-Fei Li and Ilya, something cool has to come out of the small shops. Heck, I'm even rooting for the "let's commoditise LoRA training" angle that Mira's startup seems to be going for.

    • a) Still true: vanilla LLMs can’t do math, they pattern-match unless you bolt on tools.

      b) Still true: next-token prediction isn’t planning.

      c) Still true: error accumulation is mitigated, not eliminated. Long-context quality still relies on retrieval, checks, and verifiers.

      Yann’s claims were about LLMs as LLMs. With tooling, you can work around limits, but the core point stands.
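
      To make the "bolt on tools" point concrete, here's a minimal sketch (everything in it, including the ask_llm stub, is a hypothetical illustration rather than any particular API): the model only translates the question into an arithmetic expression, and a deterministic tool evaluates it, so the wrapper never relies on the model's own arithmetic.

          import ast
          import operator

          # Hypothetical stand-in for an LLM call; a real system would hit an API here.
          def ask_llm(prompt: str) -> str:
              return "1234 * 5678"  # pretend the model rewrote the question as an expression

          # A tiny, safe arithmetic evaluator: the "tool" bolted onto the model.
          OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
                 ast.Mult: operator.mul, ast.Div: operator.truediv}

          def evaluate(expr: str):
              def walk(node):
                  if isinstance(node, ast.BinOp) and type(node.op) in OPS:
                      return OPS[type(node.op)](walk(node.left), walk(node.right))
                  if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                      return node.value
                  raise ValueError("unsupported expression")
              return walk(ast.parse(expr, mode="eval").body)

          # The model translates the question; the tool does the math.
          print(evaluate(ask_llm("What is 1234 times 5678?")))  # 7006652

      The point of the sketch is where the correctness lives: in the evaluator, not in the model's token predictions.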

    • That's true, but I also think that despite being wrong about the capabilities of LLMs, LeCun has been right that variations of LLMs are not an appropriate target for long-term research that aims to significantly advance AI, especially at the level of Meta.

      I think transformers have been proven to be general purpose, but that doesn't mean that we can't use new fundamental approaches.

      To me it's obvious that researchers are acting like sheep as they always do. He's trying to come up with a real innovation.

      LeCun has seen how new paradigms have taken over. Variations of LLMs are not the type of new paradigm that serious researchers should be aiming for.

      I wonder if there can be a unification of spatial-temporal representations and language. I am guessing diffusion video generators already achieve this in some way. But I wonder if new techniques can improve the efficiency and capabilities.

      I assume the Nested Learning stuff is pretty relevant.

      Although I've never totally grokked transformers and LLMs, I always felt that MoE was the right direction, and that besides having a strong mapping or unified view of spatial and language info, there should also somehow be a capability for representing information in a non-sequential way. We really only use sequences because we can only speak or hear one sound at a time. Information in general isn't particularly sequential, so I doubt that's an ideal representation.

      So I guess I am kind of proposing variations of transformers myself, to be honest.

      But besides being able to convert between sequential discrete representations and less discrete non-sequential representations (maybe you have tokens but every token has a scalar attached), there should be lots of tokenizations, maybe for each expert. Then you have experts that specialize in combining and translating between different scalar-token tokenizations.

      Like automatically clustering problems or world model artifacts or something and automatically encoding DSLs for each sub problem.

      I wish I really understood machine learning.

      > So the longer you go in your generation, the higher the error rate, and at long contexts the answers become utter garbage.

      Not totally wrong. They can self-correct, but it seems context rot will eventually set in.