Now this is interesting! I've always been very dubious about Chomsky's innateness theories; does #ChatGPT show anything interesting about whether particular innate structures are, or more likely are not, actually needed to learn human language?
Even if they aren't needed, one could still claim that humans do have them and that it shows in the way we learn language, which is arguably different from the way an LLM does; but this is still an interesting start.
@pre
Hm, that's an interesting angle. I don't remember him ever saying that language could be learned without intrinsic structure, if only you had enough data, but maybe he did somewhere.
@ceoln His point was definitely that kids don't get enough exposure to learn it without innate clues.
He didn't, to my knowledge, also say that it could be learned if you did get enough data. He probably never imagined a machine that could read the whole internet.
He just thought kids didn't get enough data to do it without innate clues.
@pre
Right; I wonder if he's ever thought about how much WOULD be enough, or if as you say he hadn't encountered that idea until very recently. 😁
@ceoln
Here’s a survey article on large language models and how they undermine Chomsky’s approach to language: https://lingbuzz.net/lingbuzz/007180
It’s a good survey article, but I read it with a grain of salt, and want to spend time with the bibliography.
Here’s another good survey article on the 60-year feud between generative linguistics and neural net based theories of how the mind works:
https://sites.socsci.uci.edu/~lpearl/colareadinggroup/readings/Pater2019_GenLingDeepLearning.pdf
@lain_7
"Note that this specific example was not in the model’s training set—there is no
possibility that Trump understands prime numbers." 😂
@ceoln As others have said, a big part of Chomsky's argument is known as the "poverty of stimulus" argument — kids don't hear enough input to account for the grammatical understanding they show. There's a second aspect: kids don't make the mistakes they would if they didn't have innate theories about how languages work.
Okay, so there's some evidence that these statistical learning models pick up grammar from a tiny fraction of their input — additional input gives them more facts about the world to parrot, but doesn't improve their grammar much. That undermines the poverty of stimulus argument.
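To make "pick up grammar" concrete: one common way to probe it is with minimal pairs, checking whether a model assigns higher probability to a grammatical sentence than to a minimally different ungrammatical one (BLiMP-style). Here's a rough sketch of that idea, assuming the Hugging Face transformers library and plain GPT-2; the example pairs are just illustrative, not from any actual benchmark:

```python
# Rough sketch of a minimal-pair grammar probe: does the model prefer the
# grammatical member of each pair? Assumes the `transformers` library.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Total log-probability the model assigns to the sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # The model's loss is the mean negative log-likelihood per predicted token.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

pairs = [
    ("The keys to the cabinet are on the table.",
     "The keys to the cabinet is on the table."),   # subject-verb agreement
    ("Who did you say that Mary saw?",
     "Who did you say that saw Mary?"),             # a that-trace style contrast
]

for good, bad in pairs:
    prefers_good = sentence_log_prob(good) > sentence_log_prob(bad)
    print(f"prefers grammatical version: {prefers_good}")
```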
So what about the “don’t make mistakes they would if their grammar theories weren’t constrained” argument? Statistical learning models don’t really seem to touch that. They make all kinds of mistakes during their training phase.
Plus, humans have trouble learning artificial languages that break some of Chomsky’s ground rules about phrase structure, while the robots learn those languages just fine, so their learning process is different from what humans do.
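To give a flavor of what "breaking the ground rules" means: as I understand it, those artificial-language experiments contrast structure-dependent rules with purely linear, word-counting rules. Here's a toy illustration of that contrast (my own sketch, not the actual experimental materials):

```python
# Toy contrast: a structure-dependent negation rule vs. a purely linear
# (word-counting) one. A string-pattern learner has no particular preference
# between them; humans reportedly find the counting rule much harder to acquire.
import random

SUBJECTS = [["the", "dog"], ["the", "old", "cat"], ["a", "very", "small", "bird"]]
VERB_PHRASES = [["sleeps"], ["chases", "the", "ball"]]

def negate_structurally(subject, verb_phrase):
    # Structure-dependent rule: "not" goes right before the verb phrase,
    # wherever that phrase happens to start.
    return subject + ["not"] + verb_phrase

def negate_linearly(subject, verb_phrase):
    # Linear rule: "not" always goes after the third word of the string,
    # ignoring phrase structure entirely.
    words = subject + verb_phrase
    return words[:3] + ["not"] + words[3:]

for subj in SUBJECTS:
    vp = random.choice(VERB_PHRASES)
    print("structural:", " ".join(negate_structurally(subj, vp)))
    print("linear:    ", " ".join(negate_linearly(subj, vp)))
```

With the linear rule, "not" ends up splitting the noun phrase whenever the subject is long enough, which no natural language does; a learner that only tracks string patterns doesn't care.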
@ceoln So, finally, a problem with Chomsky's innateness idea is: what's the innate bit? Chomsky has pared it down to something he calls "merge", which I understand to be akin to recursion, or perhaps analogous to a stack. Everett (the guy interviewed by the Tehran Times article you cited) has studied an Amazon tribe whose language, he thinks, doesn't display evidence of "merge". Everett is the only person who has spent substantial time with these people, so to argue with him about this strikes me as going out on a limb.
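For what it's worth, here's a minimal sketch of how I understand "merge": a single binary operation that combines two syntactic objects into a new one and can then reapply to its own output, which is where the recursion comes from. This is just my gloss, not Chomsky's formalism:

```python
# Merge as I understand it: binary combination of two syntactic objects,
# reapplied to its own output. One operation, unbounded hierarchical structure.

def merge(x, y):
    # Using an ordered pair here just to keep the printout readable;
    # the point is the nesting, not the ordering.
    return (x, y)

# Build "the dog chased the cat" bottom-up, two objects at a time.
np1 = merge("the", "dog")
np2 = merge("the", "cat")
vp = merge("chased", np2)
sentence = merge(np1, vp)

print(sentence)
# (('the', 'dog'), ('chased', ('the', 'cat')))
```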
(Sorry to be all “reply guy”. Hope this was interesting.)
@lain_7
No, thanks much, that's all really interesting! I should read some of these papers; the idea that (say) an LLM can learn some non-Chomskian (artificial) languages just as well as it can learn human ones, while people have more trouble with them, for some reason tickles me.
@ceoln I think Chomsky didn't claim that language can't be learned from lots of examples; he said children aren't exposed to enough of it to have the data to learn it.
ChatGPT can't do it with just the amount of talking a 5-year-old has heard; it needs to read nearly the entire internet.