Comment by Smaug123

14 days ago

From some Googling and use of Claude (and from summaries of the suggestively titled "Impossible Languages" by Moro, linked from https://en.wikipedia.org/wiki/Universal_grammar), it looks like he's referring to languages that violate the laws constraining which languages humans are innately capable of learning. But it's very unclear why "machine M is capable of learning more complex languages than humans" implies anything about the linguistic competence or the intelligence of machine M.
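
To make "languages which violate the laws" concrete: as I understand the summaries, the impossible languages Moro studies have rules that are structure-independent, i.e. defined over linear word positions rather than syntactic constituents. A toy sketch of such a rule (my own illustrative example, not one taken from the book):

    def negate_impossible(sentence):
        # "Impossible" rule: insert the negation marker after the
        # third word, counting linearly, with no reference to
        # syntactic structure.
        words = sentence.split()
        return " ".join(words[:3] + ["not"] + words[3:])

    print(negate_impossible("the old dog chased the cat"))
    # -> "the old dog not chased the cat"
    print(negate_impossible("dogs that bark loudly annoy cats"))
    # -> "dogs that bark not loudly annoy cats" (cuts across a constituent)

The claim, as far as I can tell, is that a general sequence learner can pick up a rule like this just as easily as a natural one, while humans can't; so learning it tells you nothing about specifically human linguistic competence.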

Firstly, I can't speak for Chomsky.

In this article he is very focused on science and works hard to delineate science (research? deriving new facts?) from engineering (clearly product-oriented). In his opinion ChatGPT falls on the engineering side of this line: it's a product of engineering, and OpenAI is concentrating on marketing. For sure there was much science involved, but the thing we have access to is a product.

IMHO Chomsky is asking: while ChatGPT is a fascinating product, what is it teaching us about language? How is it advancing our knowledge of language? I think Chomsky is saying "not much."

Someone else mentioned embeddings and the relationships between words that they reveal. Indeed, this could be a worthy area of further research; you'd think it would be a real boon when comparing languages. Unfortunately the interviewer didn't ask Chomsky about this.
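
For concreteness, this is the sort of relationship embeddings expose; a minimal sketch, assuming gensim and its downloadable pretrained GloVe vectors:

    import gensim.downloader as api

    # 50-dimensional GloVe word vectors (downloaded on first use).
    vectors = api.load("glove-wiki-gigaword-50")

    # Nearest neighbours in the embedding space:
    print(vectors.most_similar("language", topn=5))

    # The classic analogy: king - man + woman lands near queen.
    print(vectors.most_similar(positive=["king", "woman"],
                               negative=["man"], topn=3))

And as far as I know, aligning the embedding spaces of two different languages is indeed an active research area, which is exactly the "comparing languages" angle.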

It doesn't; it just says that LLMs are not useful models of the human language faculty.

  • This is where I'm stuck.

    For other commentators, as I understand it, Chomsky's talking about well-defined grammars, languages, and production systems. Think Hofstadter's Gödel, Escher, Bach. Not a "folk" understanding of language.

    I have no understanding or intuition, not even a fingernail grasp, of how an LLM generates "sentences" that read as though they were produced by a generative grammar.

    Is anyone comparing and contrasting these two different techniques? Being a noob, I wouldn't even know where to start looking.
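
    A toy contrast, for whatever it's worth. The hand-written grammar and the bigram table below are my own illustrative stand-ins, the bigram table playing the role of a trained LLM in miniature:

        import random

        # Technique 1: a generative grammar. Sentences are derived
        # top-down by expanding production rules, so grammaticality
        # is guaranteed by construction.
        grammar = {
            "S":  [["NP", "VP"]],
            "NP": [["the", "N"]],
            "VP": [["V", "NP"]],
            "N":  [["dog"], ["cat"]],
            "V":  [["chased"], ["saw"]],
        }

        def derive(symbol="S"):
            if symbol not in grammar:            # terminal word
                return [symbol]
            expansion = random.choice(grammar[symbol])
            return [w for part in expansion for w in derive(part)]

        print(" ".join(derive()))  # e.g. "the dog chased the cat"

        # Technique 2: next-token prediction, the LLM strategy in
        # miniature. No explicit rules; each word is sampled given
        # only what came before.
        bigrams = {
            "<s>": ["the"], "the": ["dog", "cat"],
            "dog": ["chased", "saw"], "cat": ["chased", "saw"],
            "chased": ["the"], "saw": ["the"],
        }

        def sample(max_len=6):
            out, prev = [], "<s>"
            while len(out) < max_len and prev in bigrams:
                prev = random.choice(bigrams[prev])
                out.append(prev)
            return " ".join(out)

        print(sample())  # e.g. "the cat saw the": locally fluent,
                         # but nothing guarantees it ends grammatically

    The first technique can only produce sentences its rules license; the second produces whatever is statistically plausible next. Real LLMs are technique 2 scaled up enormously, which is (I gather) why it's surprising that their output looks like it came from technique 1.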

    I've gleaned that some people are using LLMs/GPT to emit abstract syntax trees (vs. a mere stream of tokens) that can serve as input for formal grammars (e.g. programming source code). That sounds awesome. And something I might some day sorta understand.
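
    If it helps, Python's built-in ast module shows the flavour of the idea: parse the model's output against the language's formal grammar and you get a syntax tree rather than a token stream. (The generated string below is a stand-in for real model output; no actual LLM is being called.)

        import ast

        # Stand-in for text returned by an LLM asked to write Python.
        generated = "def add(a, b):\n    return a + b\n"

        try:
            tree = ast.parse(generated)      # check against Python's grammar
            print(ast.dump(tree, indent=2))  # a structured tree, not tokens
        except SyntaxError as err:
            print("model output is not valid Python:", err)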

    I've also gleaned that, given sufficient computing power, training data for future LLMs will be tokenized into words (vs. just character sequences), which would bring the two strategies closer...? I have no idea.
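
    For what it's worth, my understanding is that current LLMs already sit between those extremes: they tokenize text into subwords (e.g. byte-pair encoding), not raw characters or whole words. Roughly:

        text = "unbelievable results"

        # Character-level: tiny vocabulary, very long sequences.
        print(list(text))

        # Word-level: short sequences, but a huge vocabulary and no
        # way to represent unseen words.
        print(text.split())

        # Subword (BPE-style) tokenization is the middle ground LLMs
        # actually use; an illustrative split, not from a real tokenizer:
        print(["un", "believ", "able", " results"])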

    (Am noob, so forgive my poor use of terminology. And poor understanding of the tech, too.)

    • I don't really understand your question, but if a deep neural network predicts the weather, we don't have any problem accepting that the network is not an explanatory model of the weather (the weather is not a neural net). The same is true of predicting language tokens.

      7 replies →