Comment by ksec

14 days ago

Interesting that this is released literally one hour after another discussion criticizing Meta ( https://news.ycombinator.com/item?id=43562768 )

> At this point it does not matter what you believe about LLMs: in general, trusting LeCun's words is not a good idea. Add to this that LeCun is directing an AI lab that at the same time has the following huge issues:

1. The weakest LLM among the big labs with similar resources (and even smaller ones: DeepSeek).

2. They say they are focusing on open source models, but their license is among the least open of the available open-weight models.

3. LLMs, and the new AI wave in general, put CNNs, a field where LeCun worked a lot (but did not start himself), in perspective: they are now just a chapter in a book composed mostly of other techniques.

It would be interesting to see antirez's opinion on this new release.

Not that I agree with all the linked points, but it is weird to me that LeCun consistently states LLMs are not the right path, yet LLMs are still the main flagship models they are shipping.

Although maybe he's using an odd definition of what counts as an LLM.

https://www.threads.net/@yannlecun/post/DD0ac1_v7Ij?hl=en

  • > LeCun consistently states LLMs are not the right path yet LLMs are still the main flagship model they are shipping.

    I really don't see what's controversial about this. If that's to mean that LLMs are inherently flawed/limited and just represent a local maximum in the overall journey towards developing better AI techniques, I thought that was a pretty universal understanding by now.

  • That is how I read it. Transformer based LLMs have limitations that are fundamental to the technology. It does not seem crazy to me that a guy involved in research at his level would say that they are a stepping stone to something better.

    What I find most interesting is his estimate of five years, which is soon enough that I would guess he sees one or more potential successors.

    • In our field (AI) nobody can see even 5 months ahead, including people who are training a model today to be released 5 months from now. Predicting something 5 years from now is about as accurate as predicting something 100 years from now.

      4 replies →

I don't understand what LeCun is trying to say. Why does he give an interview saying that LLMs are almost obsolete just when they're about to release a model that increases the SotA context length by an order of magnitude? It's almost like a Dr. Jekyll and Mr. Hyde situation.

  • LeCun fundamentally doesn't think bigger and better LLMs will lead to anything resembling "AGI", although he thinks they may be some component of AGI. Also, he leads the research division; increasing context length from 2M to 10M is not interesting to him.

    • But ... that's not how science works. There are myriad examples of engineering advances pushing basic science forward. I just can't understand why he'd have such a "fixed mindset" about a field where the engineering is advancing by an order of magnitude every year.

      4 replies →

  • A company can do R&D into new approaches while optimizing and iterating upon an existing approach.

I mean, they're not comparing against Gemini 2.5 or the o-series of models, so I'm not sure they're really beating the first point (and their best model is not even released yet).

Is the new license different? Or does it still fail on the same issues raised in the second point?

I think the problem with the 3rd point is that LeCun is not leading Llama, right? So this doesn't change things, though mostly because it wasn't a good consideration before.

LeCun doesn't believe in the LLM architecture anyway.

It could easily be that he just researches the bleeding edge with his team while others work on Llama and experiment with new techniques on it.

Any blog post or YouTube documentary going into detail on how they work?