A colleague of mine wrote to me about my new paper, “A Reclassification of the Chinese Language,” which has just been uploaded to Academia. This paper reanalyzes the Chinese or Sinitic language family, expanding it from the 14 languages that Ethnologue lists to 343 languages, based on the criterion of mutual intelligibility, with >90% = dialect and <90% = separate language.
Your topic is deep and complex. I find it valuable even if it’s still just a pilot.
All works by Hilary Chappell might be of interest to you. However, she studies the typology of dialects, whereas you implement a criterion of mutual intelligibility.
Only one point on which I feel you should be cautious from the start: intelligibility is not always bidirectional; e.g.
1) Portuguese speakers understand Spanish speakers better than the reverse, and this is due to differences in the phonological system, which is much more complex in Portuguese.
In the case of Portuguese v Spanish, we average the two directions together.
Portuguese speakers have 58% intelligibility of Spanish.
Spanish speakers have 50% intelligibility of Portuguese.
Averaged together, that is 54%.
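The averaging step above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a formal method: the function names are hypothetical, and treating exactly 90% as "dialect" is my assumption, since the criterion as stated only covers >90% and <90%.

```python
def mean_intelligibility(a_of_b: float, b_of_a: float) -> float:
    """Average two directional intelligibility scores, both in percent."""
    return (a_of_b + b_of_a) / 2


def classify(score: float, cutoff: float = 90.0) -> str:
    """Apply the 90% cutoff to an averaged score.

    Assumption: a score of exactly 90% counts as a dialect.
    """
    return "dialect" if score >= cutoff else "separate language"


# Portuguese speakers' intelligibility of Spanish: 58%
# Spanish speakers' intelligibility of Portuguese: 50%
pt_es = mean_intelligibility(58.0, 50.0)
print(pt_es, classify(pt_es))  # 54.0 separate language
```

Averaging hides the asymmetry, so it is worth keeping the two directional figures alongside the combined one.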
Factor out Bilingual Learning
One thing I always try to factor out is bilingual learning. You want to try for “virgin ears.” Where bilingual learning is present, the results are not valid. We look for “inherent intelligibility.” Males often have better intelligibility of Lect B simply because they have heard it at work, while females have stayed at home and not heard it. A friend of mine lives in a city called Leyang (?) in Central Hebei. She said once you go three cities over, you get to a new language. She said it takes ~3 weeks of close contact with the new language to pick it up, often due to tone differences. 3 weeks of close contact is about right for closely related languages with ~80% intelligibility.
A famous Sinologist agreed with me that a 90% cutoff is a good cutoff. Once you start getting below 90%, you start running into intelligibility issues, mostly in more complex and technological speech but also in everyday speech.
Native speakers are actually excellent informants as far as mutual intelligibility goes. You only have to find decent, normal, scientifically minded people of some intelligence.
Bugbears
However, there are two types whose opinions should simply be dropped:
Problem: “Everyone can understand everyone.” “I can understand everyone.” The first one is usually some sort of nationalist. The second one may be full of it, very smart or maybe very good at languages. I am not interested in how well Mr. Amateur Linguist can understand Lect A. I want to know how well Joe Average Peasant understands Lect A under normal day-to-day conditions.
False alarm: “No one can understand each other.” “We cannot understand them.” This person is lying: in fact, everyone can understand everyone, or Group A can indeed understand Group B. Why are they lying? They want to feel that their speech is unique and that they are different from the other speakers, or they simply do not like the other speakers and do not want to believe they speak the same language as their rivals.
Realistically, I never run into people like this. This is a straw man thrown up by linguists who are nervous about the mutual intelligibility concept.
Native Speaker Judgment
Just ask native speakers. Most native speakers are excellent informants and will tell you straight off whether, and how well, they can understand the other speakers. Many of these people are of limited education or even intelligence, but they can often give you a rather oddly accurate figure like “65%.” Then you will ask several other “ignorant peasants” and they will tell you “70%” or “60%.” It is amazing what sort of intuitive judgment your average native speaker has in this area. Where you get different numbers, just average them together. If you get enough informants, the average should look pretty good.
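As a rough sketch of how several informants' figures might be combined, here is the same averaging in Python using the standard `statistics` module. The function name and the idea of also reporting the spread are my own illustrative additions, and the three figures are simply the example numbers from the paragraph above.

```python
from statistics import mean, stdev


def summarize_estimates(estimates: list[float]) -> tuple[float, float]:
    """Return the mean and sample standard deviation of informant estimates (percent).

    With fewer than two estimates the spread is undefined, so 0.0 is returned for it.
    """
    spread = stdev(estimates) if len(estimates) > 1 else 0.0
    return mean(estimates), spread


# Example figures from three informants: 65%, 70% and 60%.
average, spread = summarize_estimates([65.0, 70.0, 60.0])
print(f"average = {average:.1f}%, spread = {spread:.1f}")
```

A small spread suggests the informants broadly agree; a large one is a hint that bilingual learning or an ideologue may be skewing some of the answers.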
“There Is No Way to Accurately Gauge Mutual Intelligibility”
A red herring, very popular among the “Physics envy” crowd in Linguistics, where it seems we can never really measure or define much of anything because the social sciences measure subjective and variable human beings. Scientific intelligibility studies are very good, and SIL and Ethnologue now have them down to a fine art. Linguists who claim this is not measurable are simply ignorant.
“Only Experts Can Answer These Questions”
A red herring. “Don’t ask native speakers, ask a linguist.” This argument comes from linguists and discounts native speaker judgements about intelligibility, arguing instead that these judgements are properly made only by linguists, apparently those who have studied the language. Why a scholar knows Language A better than the folks who speak it is beyond me. The reason for this argument is apparently that native speakers lie.
But my experience has been that they generally do not, and anyway, the only real liars (over-estimators who say Lect A understands Lect B perfectly, often nationalists) are easily ferreted out. Also, this is an elitist attitude that seems to say that native speaker judgements should be discounted in favor of the truth-speakers from the ivory towers. This elitist argument is disturbing.
Be Careful of the Judgements of Second Language Speakers
Some “hazy” lects are mostly just hard to understand for 2nd language speakers. This may be the case with Beijinghua and Berlinerisch, both of which 2nd language speakers of Chinese and German consider infernal. However, most Chinese and Germans find them odd, funny, or crazy but nevertheless more or less intelligible.
“Our Language” Versus “Not Our Language”
The distinction between “our language” and “not our language” is well understood.
People have a good knowledge of “what is our language.” For instance I spoke with someone who spoke a language called Bergish spoken around Dusseldorf (not listed in Ethnologue – many German languages are not listed). I went through a number of the surrounding cities, and she would say, “That is our language” or “They speak our language,” often adding that some words were different but it was clearly the same tongue.
When I got all the way over to Aix, she told me that was absolutely not her language anymore. She didn’t have a name for it other than Aacher Platt, but it was another language all right. Intelligibility was “about 60%.” I wondered if her language was spoken over the border in the Netherlands (there was a suggestion that it might be), and she simply told me that she could not understand any lects spoken in the Netherlands. Therefore, Bergish apparently does not cross the border; what is spoken on the Dutch side is instead the South Guelderish language.
Over around Aix, where Belgium, Germany and the Netherlands all come together, I had an informant in Stolberg, Germany who spoke a language called Southeast Limburgs (also not listed in Ethnologue). He spoke of the tongue a bit further away in Cologne (Kölsch Ripuarian) as not quite his language. “That is 70% our language,” he told me. I am not sure what he meant by that, but I figured 70% intelligibility and a new tongue. We went up and over to Belgium in the Three Countries area around Kerkrade, and he thought a bit and said, “That is still our language. It is 90% our language.” I figured 90% intelligibility and Kerkraads as part of Southeast Limburgs.
The Troublesome Politics of Many New Languages
One problem with my 90% marker is that my Sinologist friend told me that if he set Sinitic tongues at 90% intelligibility, he could easily come up with 2,000 Sinitic languages. That would certainly create a few waves, if not deadly tsunamis, and the Chinese government would not be pleased, but at the very least it would be a nice hypothesis to see get tossed about in the lit by sober-minded linguistic scientists.
Structural Differences Are Better than Mutual Intelligibility
Chappell is doing something similar. At some point, linguists speak of “structurally separate languages.” At some X degree of differentiation, the lects are just too different structurally and hence are thought to be different tongues. This is supposed to be an argument against my mutual intelligibility criterion, but actually it is rather circular: once two lects get far enough apart that linguists want to split them off as structurally separate languages, the same differentiation that justified the split also starts to impede mutual intelligibility, often pushing it below the 90% mark. So these two criteria end up with the same conclusions anyway.
Lexical Similarity Is Better than Mutual Intelligibility
We are also supposed to look at lexical similarity instead of mutual intelligibility, but it is misleading. Asturian has 95% lexical similarity with Spanish, yet Spanish intelligibility of hard Leonese is as low as 25%. As you can see, lexical similarity is not quite accurate for deciding against a split, although it may be helpful in deciding for a split. For example, the North Frisian “dialects” only have 65-70% lexical similarity, so they cannot possibly be dialects of a single tongue and must be separate languages.
Natural Barriers and Transitional Lects, not Mutual Intelligibility
Once again, this ties in very well with mutual intelligibility. Intelligibility in Asturian is good except where it transitions into other languages. In the west, it transitions into Galician, and we start getting intelligibility issues and probably a new language in between Galician and Asturian (Eonavian). In the east, Asturian transitions into Castilian around Eastern Asturian and Cantabrian, and we start getting into a new language (Cantabrian-Extremaduran). These two arguments are not competing; instead, they are eating each other’s tails.
Bilingual Learning Ability Is Overrated
My Sinologist friend told me that he knows Mandarin speakers who have been living in Hong Kong surrounded by Cantonese speakers for 20 years who do not know one word of Cantonese. This implies right there that the two might be quite different and that we are surely dealing with two separate tongues. If Cantonese and Mandarin were dialects of a single tongue, there is no way that a Mandarin speaker could spend 20 years living in Hong Kong and never learn one word of Cantonese.
Impressionistic Judgements Are Not Scientific
The problem here is that impressionistic judgements are quite accurate in terms of determining dialect from language. I often hear hazy judgements like, “You know, we can’t really understand those people very well” or “Sometimes I have a hard time understanding them, but I can tell by listening to them that a lot of their words are the same as ours.”
Although those are impressionistic judgements, when speakers start talking like that, they are generally referring to intelligibility somewhere around 80-90%. At over 90% intelligibility, in general, no non-ideologue ever says that Lect B is different from Lect A. They always say it is the same language, and often add that maybe some words are different here and there. Once again this is an impressionistic judgement, but it is generally accurate.
The problem with Chinese is that I got the impression that the vast field of Chinese linguistics in China (and the volume of work by Chinese scholars and students is truly huge) has written many times on mutual intelligibility and the like. However, when I was doing this study, so much of the best work I wanted to read always seemed to be in Chinese, and I simply cannot read Chinese at all. No study of this subject will ever be done properly until the Chinese literature, and hopefully Chinese scholars and students themselves, are involved.
I have a lot more to say about this but I will leave it at this for now.