Comment by atdt
14 days ago
The level of intellectual engagement with Chomsky's ideas in the comments here is shockingly low. Surely, we are capable of holding these two thoughts: one, that the facility of LLMs is fantastic and useful, and two, that the major breakthroughs of AI this decade have not, at least so far, substantially deepened our understanding of our own intelligence and its constitution.
That may change, particularly if the intelligence of LLMs proves to be analogous to our own in some deep way—a point that is still very much undecided. However, if the similarities are there, so is the potential for knowledge. We have a complete mechanical understanding of LLMs and can pry apart their structure, which we cannot yet do with the brain. And some of the smartest people in the world are engaged in making LLMs smaller and more efficient; it seems possible that the push for miniaturization will rediscover some tricks also discovered by the blind watchmaker. But these things are not a given.
> AI this decade have not, at least so far, substantially deepened our understanding of our own intelligence and its constitution
I would push back on this a little bit. While it has not helped us to understand our own intelligence, it has made me question whether such a thing even exists. Perhaps there are no simple and beautiful natural laws, like those that exist in Physics, that can explain how humans think and make decisions. When CNNs learned to recognize faces through a series of hierarchical abstractions that make intuitive sense, it became hard to deny the similarities to what we're doing as humans. Perhaps it's all just emergent properties of some messy evolved substrate.
The big lesson from AI development in the last 10 years, for me, has been "I guess humans really aren't so special after all," which is similar to what we've been through with Physics. Theories often made the mistake of giving human observers some kind of special importance, which was later found to be the reason those theories failed to generalize.
> The big lesson from AI development in the last 10 years, for me, has been "I guess humans really aren't so special after all"
Instead, I would take the opposite view.
How wonderful is it that, with naturally evolved processes and neural structures, we have been able to create what we have. Van Gogh's paintings came out of the human brain. The Queens of the Skies - hundreds of tons of metal and composites flying across continents in the form of a Boeing 747 or an A380 - were designed by the human brain. We went to space, have studied nature (and have conservation programs for organisms we have found to need help), took pictures of the Pillars of Creation that are so incredibly far away… all with such a "puny" structure a few cm in diameter? I think that's freaking amazing.
"I guess humans really aren't so special after all"
This is a crazy take to me. As compared to what? The machines that we built?
Until we discover comparably intelligent life in the universe I think it's fair to say that we are indeed very special.
"Brain_s_". I find we (me included) generally overlook/underestimate the distributed nature of human intelligence, included in the AI field. That's why when I first heard of mixture of experts I was thrilled about the idea and the potential. (One could also see similarities in random forest). I believe a path to AGI(tm) would be to reproduce the evolution of human intelligence artificially. Start with small models training bigger and bigger models and let the bigger successfull models (insert RL, genetic algos, etc.) "reproduce" and teach newer models from scratch. Having different model architecture cohabit could maybe even lead to the kind of specializations we see in parts of the brain
I think it is important to realize that we need to understand language on our own terms. The logic of LLMs is not unlike alien technology to us. That being said, the minimalist program of Chomsky led nowhere, because, just like programming, it found edge case after edge case, reducing it further and further, until there was no program anymore that resembled a real theory. But it is wrong to assume that the big progress in linguistics is in vain, for the same reason it is wrong to assume Prolog, theorem provers, type theory, and category theory are in vain now that we have LLMs that can produce everything in C++. We can use the technology of linguistics to ground our knowledge, and in some dark corner of the LLM this might already have been integrated. I think the original divide between the sciences and the humanities might be deeper and more fundamental than we think. We need linguistics as a discipline of the humanities, and maybe huge swaths of Computer Science are just that.
I agree with you. I think the fundamental problem is we don't have a good unified theory of fuzzy reasoning. We have a lot of different formal approaches but they all have flaws.
Now LLMs have made a big breakthrough in that they showed we can do decent fuzzy reasoning in practice, but at the cost of nobody understanding the underlying process formally.
If we had a good unified (formal) theory of fuzzy reasoning, we could build models that reason better (or at least more predictably). But we won't get a better theory by scaling the existing models; I think Chomsky is right about that.
We lack the goal, not the means. If I am asking an LLM a question, what answer do I want? A playfully creative one? A strictly logical one? A pleasingly sycophantic one? A harshly critical one? An out-of-the-box devil's advocate one? A beautiful one? A practical one? We have no clue how to express these modes in logical reasoning.
By way of analogy, the result of the theorem prover is usually actionable (i.e. we can replace one kind of expression with its proven equivalent for some end like optimizing code-size or code-run-time), but mathematicians _still_ endeavor to translate the unwieldy and verbose machine-generated proofs into concise human-readable proofs, because those readable proofs are useful to our understanding of mathematics even long after the "productive action" has been taken.
In a way, this collaboration between the machine and the human is better than what came before, because now productive actions can be taken sooner, and mathematicians do not have to doubt whether they are searching for a proof that exists.
>That being said, the minimalist program of Chomsky led nowhere, because, just like programming, it found edge case after edge case, reducing it further and further, until there was no program anymore that resembled a real theory
As someone who has worked in linguistics, I don't really see what you're talking about. Minimalism is not full of exceptions (please elaborate on a specific example if you have one). Minimalism was created to make the old theory, Government and Binding, simpler.
But we don't have LLMs that can "produce everything in C++".
We have LLMs that can get some boilerplate right if you use them in a greenfield project, and that will repeatedly mess up your code once it grows enough for you to actually need assistance grokking it.
> Perhaps there are no simple and beautiful natural laws, like those that exist in Physics, that can explain how humans think and make decisions.
Isn't Physics trying to describe the natural world? I'm guessing you are taking two positions here that are causing me confusion with your statement: 1) that our minds can be explained strictly through physical processes, and 2) that our minds, including our intelligence, are outside the domain of Physics.
If you take 1) to be true, then it follows that Physics, at least theoretically, should be able to explain intelligence. It may be intractably hard, like it might be intractably hard to have physics describe and predict the motions of more than two planetary bodies.
I guess I'm saying that Physical laws ARE natural laws. I think you might be thinking that natural laws refer solely to all that messy, living stuff.
I think their emphasis is on simple and beautiful; not that human intelligence is outside the laws of physics, but that there will never be a "Maxwell's equations" modelling the workings of human intelligence. It will just be a big pile of hacks and complex interactions of many distinct parts, nothing like the couple of recursive LISP macros people of the 1960s might have hoped to find.
Neuroscientist here:
> Perhaps there are no simple and beautiful natural laws, like those that exist in Physics, that can explain how humans think and make decisions... Perhaps it's all just emergent properties of some messy evolved substrate.
Yeah, it is very likely that there are no such laws; it's the substrate. The fruit fly brain (let alone the human one) has been mapped, and we've figured out that it's not just the synapse count, but the 'weights' that matter too [0]. Mind you, those weights adjust in real time in a living animal out in the world.
You'll see in the literature that there are people with some 'lucky' form of hydranencephaly [1] where their brain is as thin as paper. But they vote, get married, have kids, and for some strange reason seem to work in mailrooms (not a joke). So we know it's something about the connectome that's the 'magic' of a human.
My pet theory: we need memristors [2] to better represent things. But that takes redesigning the computer from the metal on up, so it is unlikely to happen any time soon amid the current AI craze.
> The big lesson from AI development in the last 10 years, for me, has been "I guess humans really aren't so special after all," which is similar to what we've been through with Physics.
Yeah, biologists get there too, just the other way around, with animals and humans. Like, dogs make vitamin C internally, and humans have that gene too; it's just dormant, ready for evolution (or genetic engineering) to reactivate. That said, these neuroscience issues with us and the other great apes are somewhat large and strange. I'm not big into that literature, but from what little I know, the exact mechanisms and processes that get you from tool-using orangutans to tool-using humans, well, those seem to be a bit strange and harder for us to grasp. Again, not in that field though.
In the end though, humans are special. We're the only ones on the planet that ever really asked a question. There's a lot to us, and we're actually pretty strange in the end. There are many centuries of work to do in biology; we're just at the wading stage of that ocean.
[0] https://en.wikipedia.org/wiki/Drosophila_connectome
[1] https://en.wikipedia.org/wiki/Hydranencephaly
[2] https://en.wikipedia.org/wiki/Memristor
>You'll see in the literature that there are people with some 'lucky' form of hydranencephaly where their brain is as thin as paper. But they vote, get married, have kids, and for some strange reason seem to work in mailrooms (not a joke). So we know it's something about the connectome that's the 'magic' of a human.
These cases seem totally fascinating. Have you any links to examples or more information? (I'm also curious about the detail of them tending to work in mailrooms.)
It is possible that we simply haven't yet discovered those natural laws for "emergent behavior" from the "messy substrate".
> it has made me question whether such a thing even exists
I was reading a reddit post the other day where the guy lost his crypto holdings because he input his recovery phrase somewhere. We question the intelligence of LLMs because they might open a website, read something nefarious, and then do it. But here we have real humans doing the exact same thing...
> I guess humans really aren't so special after all
No they are not. But we are still far from getting there with the current LLMs and I suspect mimicking the human brain won't be the best path forward.
> But here we have real humans doing the exact same thing...
I'd wager that a motivation in designing these systems is so they do not make these mistakes. Otherwise what's the point, really.
I didn't see where he was disagreeing with this.
I'm assuming this was the part you were saying he doesn't hold, because it is pretty clear he holds the second thought.
I have a difficult time reading this as saying that LLMs aren't fantastic and useful.
This seems to be the core of his argument: that he's talking about the science side, not the engineering side.
It indeed baffles me how dismissive academics overall seem to be of recent breakthroughs in sub-symbolic approaches as models from which we can learn about 'intelligence'.
It is as if a biochemist looks at a human brain, and concludes there is no 'intelligence' there at all, just a whole lot of electro-chemical reactions. It fully ignores the potential for emergence.
Don't misunderstand me, I'm not saying 'AGI has arrived', but I'd say even current LLMs most certainly hold interesting lessons for the scientific study of human language development and evolution. What can the success of transfer learning in these models contribute to the debates on universal language faculties? How do invariants correlate across LLM systems and humans?
>It fully ignores the potential for emergence.
There are two kinds of emergence: one scientific, the other a strange, vacuous notion invoked in the absence of any theory or explanation.
The first is the emergence we talk about when, for example, gas or liquid states, or combustibility, emerge from certain chemical or physical properties of particles. It's not just that they're emergent; we can explain how they're emergent and how their properties are already present in the lower level of abstraction. Emergence properly understood is always reducible to lower states, not some magic word for when you don't know how something works.
In these AI debates, however, that's exactly how "emergence" is used: people just assert it, as if it followed necessarily from their assumptions. They don't offer a scientific explanation. (The same is true with various other topics, like consciousness, or what have you.) This is pointless; it's a sort of god of the gaps disguised as an argument. When Chomsky talks about science proper, he correctly points out that these kinds of arguments have no place in it, because the point of science is to build coherent theories.
>not some magic word for when you don't know how something works.
I'd disagree; emergence is typically what we don't understand. When we understand it, it's rarely considered an emergent concept, just something that is.
>They don't offer a scientific explanation.
Correct, because we don't have the tooling necessary to explain it yet. Emergence, as you stated, comes from simpler concepts: burn hydrogen and oxygen, for example, and water emerges from that.
Ecosystems are an emergent property of living systems, ones that we can explain rather well these days after we realized there were gaps in our knowledge. It's taken millions and millions of hours of research to piece all these bits together.
Now we are at the same place with large neural nets. What you say is pointless is not pointless at all; it's pointing at the exact things we need to work on if we want to understand them. But at the same time, understanding isn't strictly necessary; we have made advancements in scientific topics that we don't understand.
> There are two kinds of emergence: one scientific
I am not aware of any scientific kind of emergence. There's philosophical emergence, and its counterpoint - ontological reductionism.
Most people have an intuitive sense that philosophical emergence is true, and that bubbles up in their writing, taken as an axiom that we're all supposed to go along with.
On closer inspection, it is not clear to me that this isn't simply a confusion or illusion caused by the tendency of the human mind to apply abstractions and socially constructed categories on top of complicated phenomena, with those abstractions then mistaken for actual effects distinct from the underlying base-level phenomena being described.
Nobody claims mystical gaps. There is no deus ex machina claim in emergence. However, phenomena that are stable at a higher level of modelling might, for example, be fully dynamic at a lower-level model.
> the major breakthroughs of AI this decade have not, at least so far, substantially deepened our understanding of our own intelligence and its constitution
People's illusions, and their willingness to debase their own authority and control to take shortcuts that optimise towards lowest effort / highest yield (not dissimilar to what you would get with... autoregressive models!), were an astonishing insight to me.
Well said. It's wild when you think of how many "AI" products are out there that essentially entrust an LLM to make the decisions the user would otherwise make. Recruitment, trading, content creation, investment advice, medical diagnosis, legal review, dating matches, financial planning and even hiring decisions.
At some point you have to wonder: is an LLM making your hiring decision really better than rolling a die? At least the die doesn't give you the illusion of rationality; it doesn't generate a neat-sounding paragraph "explaining" why candidate A is the obvious choice. The LLM produces content that looks like reasoning but has no actual causal connection to the decision - it's a mimicry of explanation without the substance of causation.
You can argue that humans do the same thing. But post-hoc reasoning is often a feedback loop for the eventual answer. That's not the case for LLMs.
> it doesn't generate a neat sounding paragraph "explaining" why candidate A is the obvious choice.
Here I will argue that humans do the same thing. For any business of any size, recruitment has been pretty awful in recent history. The end user, that is, the manager the employee will be hired under, is typically a later step after a lot of other filters, some automated, some not.
At the end of the day the only way is to measure the results. Do LLMs produce better hiring results than some outside group?
Also, LLMs seem very good at medical pre-diagnosis. If you accurately portray your symptoms to them, they come back with a decent list of possible candidates. In barbaric nations like the US, where medical care can easily lead to bankruptcy, people are going to use it as a filter to determine if they should go in for a visit.
Chomsky's central criticism of LLMs is that they can learn impossible languages just as easily as they learn possible languages. He refers to this repeatedly in the linked interview. Therefore, he argues, they cannot teach us about our own intelligence.
However, a paper published last year (Mission: Impossible Language Models, Kallini et al.) found that LLMs do NOT learn impossible languages as easily as they learn possible languages. This undermines everything that Chomsky says about LLMs in the linked interview.
I'm not that convinced by this paper. The "impossible languages" are all English with some sort of transformation applied, such as shuffling the word order. It seems like learning such languages would require first learning English and then learning the transformation. It's not surprising that systems would be worse at learning such languages than just learning English on its own. But I don't think these sorts of languages are what Chomsky is talking about. When Chomsky says "impossible languages," he means languages that have a coherent and learnable structure but which aren't compatible with what he thinks are innate grammatical facilities of the human mind. So for instance, x86 assembly language is reasonably structured and can express anything that C++ can, but unlike C++, it doesn't have a recursive tree-based syntax. Chomsky believes that any natural language you find will be structured more like C++ than like assembly language, because he thinks humans have an innate mental facility for using tree-based languages. I actually think a better test of whether LLMs learn languages like humans would be to see if they learn assembly as well as C++. That would be incomplete of course, but it would be getting at what Chomsky's talking about.
Also, GPT-2 actually seems to do quite well on some of the tested languages, including word-hop, partial reverse, and local-shuffle. It doesn't do quite as well as on plain English, but GPT-2 was designed to learn English, so it's not surprising that English would do a little better. For instance, the tokenization seems biased towards English. They show "bookshelf" becoming the tokens "book", "sh", and "lf" – which in many of the languages get spread throughout a sentence. I don't think a system designed to learn shuffled English would tokenize this way!
https://aclanthology.org/2024.acl-long.787.pdf
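For a concrete sense of what those transformed languages look like, here is a toy sketch (not the paper's actual code; the real perturbation functions differ in their details) of reversal- and shuffle-style rearrangements of English token order:

```python
# Toy illustrations of "impossible language" perturbations in the spirit of
# Kallini et al.: the words are English, but the order follows a rule no
# natural language uses.
def full_reverse(tokens):
    # Emit the whole sentence back to front.
    return tokens[::-1]

def local_shuffle(tokens, window=3):
    # Deterministically reverse each fixed-size window of tokens
    # (a simplified stand-in for the paper's local shuffling).
    out = []
    for i in range(0, len(tokens), window):
        out.extend(reversed(tokens[i:i + window]))
    return out

sentence = "the cat sat on the mat".split()
print(full_reverse(sentence))   # ['mat', 'the', 'on', 'sat', 'cat', 'the']
print(local_shuffle(sentence))  # ['sat', 'cat', 'the', 'mat', 'the', 'on']
```

The point of contention above is whether failing to learn such perturbed English says anything about the hierarchically structured but humanly unlearnable languages Chomsky has in mind.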
The authors of that paper misunderstand what "impossible languages" refers to. It doesn't refer to any language a human can't learn. It refers to computationally simple plausible alternative languages that humans can't learn, in particular linear-order (non-hierarchical structure) languages.
What exactly do you mean by "analogous to our own" and "in a deep way" without making an appeal to magic or to not-yet-discovered fields of science? I understand what you're saying, but when you scrutinize these things you end up in a place that's less scientific than one might think. That kind of seems to be one of Chomsky's salient points: we really, really need to get a handle on when we're doing science in the contemporary Kuhnian sense and when we're doing philosophy.
The AI works on English, C++, Smalltalk, Klingon, nonsense, and gibberish. Like Turing's paper, this illustrates the difference between "machines being able to think" and "machines being able to demonstrate some well-understood mathematical process like pattern matching."
https://en.wikipedia.org/wiki/Computing_Machinery_and_Intell...
> not, at least so far, substantially deepened our understanding of our own intelligence
Science progresses in such a way that when you see it happen in front of you it doesn't seem substantial at all, because we typically don't understand the implications of new discoveries.
So far, in the last few years, we have discovered the importance of the role language plays in intelligence. We have also discovered quantitative ways to describe how close one concept is to another. More recently, from the new reasoning AI models, we have discovered something counterintuitive that also seems true of human reasoning: incorrect or incomplete reasoning can often reach the correct conclusion.
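As a minimal sketch of what that "quantitative closeness" looks like in practice, here is cosine similarity between embedding vectors; the vectors below are made-up toy values, not output from any real model.

```python
# Cosine similarity: concepts whose vectors point in similar directions
# are treated as semantically close.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dimensional "embeddings" for three concepts.
cat = np.array([0.9, 0.1, 0.3, 0.0])
dog = np.array([0.8, 0.2, 0.4, 0.1])
car = np.array([0.1, 0.9, 0.0, 0.7])

print(cosine_similarity(cat, dog))  # high: related concepts (~0.98)
print(cosine_similarity(cat, car))  # low: unrelated concepts (~0.16)
```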
In my opinion it will redefine, or already has redefined, our conceptual models of intelligence - just as physical models of atoms or gravitational mechanics evolved and newer models replaced the older ones. The older models aren't invalidated (all models are wrong, after all), but their limits are better understood.
People are waiting for this Prometheus-level moment with AI where it resembles us exactly but exceeds our capabilities, but I don't think that's necessary. It parallels humanity explaining Nature in our own image as God and claiming it was the other way around.
> if the intelligence of LLMs proves to be analogous to our own in some deep way
First, they have to implement "intelligence" for LLMs, then we can compare. /s