Mandarin dialects: Unity in diversity

October 5, 2019

Most linguistics students trained in Hong Kong have attended courses which focus on the comparison of Mandarin and Cantonese. While courses of this kind can help us appreciate the fascinating diversity of the Chinese language(s), they may also create or reinforce the common impression that Mandarin, despite its size, is an astoundingly uniform language which barely leaves any room for internal variation. Pretty soon I realized that there must be something wrong with this notion—when discussing syntax questions, “native speakers” of Mandarin often had considerable disagreements over the acceptability (or grammaticality) of certain Mandarin sentences, a phenomenon rarely observed in other languages (including Cantonese).

Later on, I was blessed with the opportunity to get exposed to Jianghuai Mandarin and Southwest Mandarin, which gave me a new, broader perspective on Mandarin, as well as the Chinese language(s) as a whole.

(1)
ŋə˦ tʰa˦ mɔ˥˧ ʂuo˩˧ kuɔ
I he not say experiential
‘I didn’t tell him.’ (adapted from Dwyer 1995)


(2)
kɯ˥˧ ʐɯ˨˩˧ xa tʂʰʅ˦ liɔ˥˧
dog meat [xa] eat perfective
‘The dog ate the meat.’ (adapted from Dede 2007a)


Do you speak Mandarin? If your answer is yes, you may find the above sentences pretty weird. While you are likely to recognize each and every word, they appear to be in a chaotic order; you may even find it difficult to understand the meaning of the sentences without looking at the translations. What’s more, if you came across these sentences in real-life scenarios (with no Chinese characters or glosses provided), you’d probably struggle to make any sense of them since the words would be pronounced rather differently from the Mandarin language you’re familiar with.

What about the following sentence?

(3) 你 晓得 他 好久 来 不?
ȵi˥˧ ɕiau˥˧te˨˩ tʰa˥ xau˥˧tɕiəu˥˧ nai˨˩ pu˨˩
you know he when come question
‘Do you know when he’ll come back?’ (adapted from Li 2002)


In this case, while there may be nothing unusual about its sentence structure, you may still have difficulty understanding the sentence because the actual meaning of some seemingly familiar words is probably quite different from what you expected. You may well be surprised to learn that all the examples illustrated above are in fact grammatical sentences of some varieties of Mandarin.

Yes, Mandarin can be quite different from what we learn and know from Mandarin Chinese textbooks and dictionaries.

Mandarin?! Seriously?

Spoken by over 900 million people as their mother tongue, Mandarin is not only the largest language in the world by number of native speakers (Simons & Fennig 2018), but also an increasingly popular choice among foreign language learners. When we talk about Mandarin, what comes to mind is typically a major lingua franca rising to global prominence, or a monolithic linguistic superpower displacing Chinese “dialects” like Cantonese, Hokkien, and Hakka in various domains, driving some less well-established ones like Minjiang, Weitou, and Shehua to the verge of endangerment or even extinction. Naturally, few would expect to see such a dominant, well-known, and well-studied language in the Language Profiles section.

A lesser-known fact about Mandarin is that it is a polysemous term. In common usage, Mandarin typically refers to a standardized form of the Chinese language spoken as a national and/or intra-ethnic lingua franca in Mainland China (as Putonghua 普通话), Taiwan (as Guoyu 國語), Singapore and Malaysia (as Huayu 华语). Although the various national standards differ from each other in a number of ways (Bradley 1992), they still maintain a very high degree of mutual intelligibility and do not constitute the focus of this article.

Adopting Sanders’ (1987) terminology, the Standard Mandarin varieties belong to “Idealized Mandarin”, which was (artificially) constructed based on the Beijing dialect in the early 20th century to facilitate nationwide communication (Moser 2016; Weng 2018). Meanwhile, in this article, my main focus is on the (naturalistic) regional vernaculars of Chinese which, in a linguistic sense, belong to a Chinese dialect group known as Mandarin, i.e. “Geographical Mandarin” according to Sanders’ (1987) terminology.

Mandarin as a Chinese dialect group

Although often considered a single language, Chinese dialects (aka Sinitic languages) carry a degree of internal diversity on a par with that of the Romance (e.g. Portuguese, Spanish, Catalan, French, Romansh, Italian, Romanian) or Germanic (e.g. English, German, Dutch, Frisian, Swedish, Danish, Norwegian, Icelandic) languages within the Indo-European family (Norman 1988; Chappell 2001). In modern Chinese dialectology, Chinese is classified into 10 major dialect groups, namely Mandarin 官话, Jin 晋语, Wu 吴语, Hui 徽语, Gan 赣语, Xiang 湘语, Min 闽语, Hakka 客家话, Yue 粤语, and Pinghua 平话 and Tuhua 土话 (Zhang 2012).

Source: Zhang 2012: Map A2

Native speakers of Mandarin account for around 70% of the Chinese-speaking population in China (Zhang 2012). Geographically, Mandarin dialects are spoken over a huge area in China, stretching from the Manchurian region in the northeast all the way to the border region in Yunnan in the southwest (the yellow region on the map above), occupying the vast majority of the Han Chinese region north of the Yangtze River.

Classification of Chinese dialect groups is based primarily on phonological criteria, especially the diachronic development of various Middle Chinese sound categories. For example, Mandarin has lost the Middle Chinese [m], [p], [t], [k] codas (which means that these sound units—or phonemes—do not occur in the syllable-end position in Mandarin words), which are preserved to different degrees in most non-Mandarin Southern Sinitic varieties (see the table below). Interested readers may refer to Norman (1988) and Kurpaska (2010) for further phonological features which define the Mandarin dialect group.

Variety             ‘one’ 一    ‘three’ 三    ‘six’ 六    ‘ten’ 十
Middle Chinese      *ʔit        *sam          *luwk       *dʑip
Beijing Mandarin    i˥          san˥          lioʊ˥˩      ʂi˧˥
Xi’an Mandarin      i˨˩         sæ̃˨˩          liou˨˩      ʂʅ˨˦
Yinchuan Mandarin   i˩˧         san˦          lu˩˧        ʂʅ˩˧
Chengdu Mandarin    i˨˩         san˥          nu˨˩        sɿ˨˩
Nanjing Mandarin    iʔ˥         sɑŋ˧˩         luʔ˥        ʂʅʔ˥
Suzhou Wu           ʔiəʔ˥       sE˥           loʔ˧        zəʔ˧
Nanchang Gan        it˥         san˦˨         liuʔ˥       sɨt˨
Xiamen Min          it˩         sam˥          liɔk˥       sip˥
Meixian Hakka       it˩         sam˦          liuk˩       səp˥
Guangzhou Yue       jɐt˥        sam˥          lok˨        sɐp˨

Source: Pulleyblank (1991) and The Great Dictionary of Modern Chinese Dialects by Li (2002)
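
As a rough illustration of the coda criterion, the short Python sketch below takes the ‘three’, ‘six’ and ‘ten’ forms from the table and checks what each variety has in place of the Middle Chinese -m, -k and -p codas. The names (FORMS, reflex, preserved_count) and the five category labels are assumptions made for this example only. The check simply looks at the last segment of each transcription, so it is a simplification and not a dialectological procedure.

```python
# Tone letters that follow each syllable in the table's IPA transcriptions.
TONE_LETTERS = "˥˦˧˨˩"

# Words used here, with the Middle Chinese coda read off *sam, *luwk, *dʑip.
WORDS = ("three", "six", "ten")
MC_CODA = {"three": "m", "six": "k", "ten": "p"}

# Modern forms copied from the table ('three', 'six', 'ten' columns only).
FORMS = {
    "Beijing Mandarin":  ("san˥", "lioʊ˥˩", "ʂi˧˥"),
    "Xi'an Mandarin":    ("sæ̃˨˩", "liou˨˩", "ʂʅ˨˦"),
    "Yinchuan Mandarin": ("san˦", "lu˩˧", "ʂʅ˩˧"),
    "Chengdu Mandarin":  ("san˥", "nu˨˩", "sɿ˨˩"),
    "Nanjing Mandarin":  ("sɑŋ˧˩", "luʔ˥", "ʂʅʔ˥"),
    "Suzhou Wu":         ("sE˥", "loʔ˧", "zəʔ˧"),
    "Nanchang Gan":      ("san˦˨", "liuʔ˥", "sɨt˨"),
    "Xiamen Min":        ("sam˥", "liɔk˥", "sip˥"),
    "Meixian Hakka":     ("sam˦", "liuk˩", "səp˥"),
    "Guangzhou Yue":     ("sam˥", "lok˨", "sɐp˨"),
}


def reflex(form, mc_coda):
    """Classify what a modern form shows in place of a Middle Chinese coda."""
    last = form.rstrip(TONE_LETTERS)[-1]  # final segment, tone letters removed
    if last == mc_coda:
        return "preserved"
    if last == "ʔ":
        return "glottal stop"
    if last in ("p", "t", "k"):
        return "different stop"
    if last in ("n", "ŋ"):
        return "different nasal"
    # Plain or nasalized vowels end up here (simplification of this sketch).
    return "no coda"


def preserved_count(variety):
    """Number of the three Middle Chinese codas kept unchanged in a variety."""
    return sum(
        reflex(form, MC_CODA[word]) == "preserved"
        for word, form in zip(WORDS, FORMS[variety])
    )


if __name__ == "__main__":
    for variety, forms in FORMS.items():
        labels = [reflex(f, MC_CODA[w]) for w, f in zip(WORDS, forms)]
        print(f"{variety:18} {labels}")
```

On these three words, the Min, Hakka and Yue forms keep all three codas, while none of the Mandarin forms keeps any of them unchanged.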

Mandarin dialects also share a range of basic vocabulary items. As Norman (1988) observes, the following seven lexical items are uniform across Mandarin dialects:

(i) The third-person pronoun is tā 他 or cognate to it

(ii) The subordinative particle is de (di) 的 or cognate to it

(iii) The ordinary negative is bù 不 or cognate to it

(iv) zhàn 站 or words cognate to it are used for ‘to stand’

(v) zǒu 走 or words cognate to it are used for ‘to walk’

(vi) érzi 儿子 or words cognate to it are used for ‘son’

(vii) fángzi 房子 or words cognate to it are used for ‘house’

Are Mandarin dialects mutually intelligible?

Chinese dialectologists usually classify Mandarin into eight subgroups, namely Northeast 东北, Beijing 北京, Jilu 冀鲁, Jiaoliao 胶辽, Central Plains 中原, Lanyin 兰银, Jianghuai 江淮, and Southwest 西南 (Zhang 2012). Although it is agreed that the subgroups differ from each other phonologically, Chinese dialectologists generally regard Mandarin as a homogeneous group with a very high level of mutual intelligibility:

A person from Harbin in Northern Manchuria has little difficulty understanding a native of Kunming some 3,200 kilometers away (Yuan 1960).

Mandarin dialects have a high degree of uniformity—speakers of different Mandarin dialects, like a Harbin speaker from Heilongjiang, an Urumqi speaker from Xinjiang, a Kunming speaker from Yunnan, and a Nanjing speaker from Jiangsu, can readily communicate with each other using their native dialect (Li & Xiang 2009).

Despite their prevalence in the field, claims of this kind should be taken with a pinch of salt. Yes, speakers of different Mandarin dialects can readily communicate with each other as long as they are reasonably proficient in Standard Mandarin. When discussing the mutual intelligibility between different Mandarin dialects, we must always draw a clear distinction between Mandarin dialects (i.e. local vernaculars which belong to the Mandarin dialect group) and the regional varieties of Standard Mandarin (i.e. Standard Mandarin spoken with different regional accents, aka “Local Mandarin” according to Sanders (1987)).

If Mandarin dialects were indeed that homogeneous, we would expect any proficient speaker of Standard Mandarin (which is based largely on Beijing Mandarin), regardless of their linguistic and/or geographical background, to be able to understand any Mandarin dialect with ease. Anyone with some basic knowledge of Standard Mandarin and a handful of Mandarin dialects can tell that this is an unrealistic expectation. You don’t have to plan a three-month field trip to some remote villages to appreciate the incredible diversity among the Mandarin subgroups. Just go to major cities like Xi’an (Central Plains), Dalian (Jiaoliao), Chengdu (Southwest), or Nanjing (Jianghuai), and pay attention to the vernaculars spoken among the locals, especially the middle-aged and elderly. Alternatively, you may simply do a YouTube (or Baidu) search on any well-known Mandarin dialect (not limited to the aforementioned ones); in a matter of minutes, you can gain exposure to myriads of exotic-sounding dialects.

The following two videos each feature a conversation between a speaker of Standard Mandarin and a speaker of a local Mandarin dialect, where the latter can understand Standard Mandarin but cannot really speak it. Communication is therefore still marginally possible in these cases. Imagine what would happen if two speakers of different Mandarin dialects had to communicate with each other!

Sichuanese, a representative variety of Southwest Mandarin:

Dalian Mandarin, a representative variety of Jiaoliao Mandarin:

For a more colloquial impression of these Mandarin dialects, listen to these candid—and slightly rude—examples of Sichuanese and Dalian Mandarin.


Dalian Mandarin:

Put simply, the “intelligibility claim” is divorced from reality. More specifically, according to the personal experience of friends and colleagues from various Mandarin-speaking regions, without prior exposure, speakers of different Mandarin dialects often have difficulty understanding each other’s local vernacular even if they come from one and the same province, provided that two or more distinct subgroups of Mandarin are spoken therein.

Typical examples include Shandong (Jiaoliao, Jilu, Central Plains), Jiangsu (Central Plains and Jianghuai), and Hubei (Jianghuai and Southwest). In some cases, mutual intelligibility is not guaranteed even if the Mandarin dialects concerned belong to the same subgroup and are spoken within the same province. A native speaker of the Zhenjiang dialect (a Jianghuai Mandarin dialect spoken in the Jiangsu province) reported that it is impossible for her to understand the Nantong dialect (another Jianghuai Mandarin dialect spoken around 140 kilometers away from her neighbourhood in the same province).

Variation between Northern Sinitic and Southern Sinitic

Of course, as linguists, we cannot make any strong claim based on gut feeling and anecdotal evidence. Intrigued by the remarkable diversity within the Mandarin dialect group, I decided to conduct a typological survey of 26 Mandarin dialects, plus 16 dialects which belong to other Chinese dialect groups. The resultant research article, co-authored with my supervisors Umberto Ansaldo and Stephen Matthews, was recently published in Linguistic Typology (Szeto et al. 2018). The results are in stark contrast to the common belief in a homogeneous Mandarin dialect group, but highly consistent with our preliminary observations—Mandarin dialects demonstrate internal variation in all major domains of grammar (phonology, morphosyntax, semantics, and grammaticalization patterns).

Adopting a quantitative approach, we find that the degree of typological diversity within the Mandarin dialect group is comparable to that of the Sinitic branch as a whole. This implies that, if the Chinese dialect groups taken together are indeed as internally diverse as the Romance or Germanic languages, the Mandarin dialect group alone may carry a comparable degree of internal diversity from a typological (or structural) perspective!

The extensive geographical range of Mandarin can help explain its typological diversity. Sandwiched between Altaic languages (e.g. Manchu, Mongolian, Uyghur) to the north and Tai languages (e.g. Zhuang, Lao, Thai) to the south, Sinitic as a whole can be considered typologically intermediate between these two groups of languages. A north-south divide, whose boundary is conventionally drawn along the Qinling Mountain-Huaihe River Line, is evident in the Sinitic branch.

Source: Wikimedia Commons

Northern Sinitic shows signs of typological convergence towards Altaic languages (Hashimoto 1976) and Southern Sinitic towards Tai languages (Bennett 1979). For instance, the northern varieties tend to have fewer numeral classifiers, monosyllabic words, tones, and codas, as well as a stronger tendency towards head-final structures, as exemplified below.

Transcending the Qinling Mountain-Huaihe River Line, the Mandarin dialect group also displays a north-south divide in typological features. The adjective-final comparative constructions (e.g. Standard Mandarin 我比他高 in (4)) represent a typical example of head-final structures, where the “head” of the phrase in question (e.g. the adjective in an adjective phrase, or the noun in a noun phrase) is at the end of the phrase.

(4) 我 比 他 高 [Standard Mandarin]
wǒ bǐ tā gāo
I compare he tall
‘I’m taller than him.’


While this sentence may look perfectly natural to Mandarin speakers, cross-linguistically speaking, the head-final adjective phrase actually correlates with SOV languages (Dryer 1992), which typically have the parts of the sentence structured in the order of subject-object-verb. It is in fact unusual for an SVO language like Mandarin to possess such a word order. Unsurprisingly, the adjective-final comparative constructions are more common in Northern China, where influence from the SOV Altaic languages is relatively profound. Meanwhile, the surpass comparatives (where a verb meaning ‘to cross/surpass’ has developed into a comparative marker) predominate in Southern China, as well as Mainland Southeast Asia (Ansaldo 2010).

(5) 我 高 过 佢/他
ngo5 gou1 gwo3 keoi5 [Cantonese]
ŋo˥˧ kɑ˦ ko˨˦ tʰɑ˦ [Liuzhou Mandarin]
ŋo˥˧ kɑu˥ ko˨˦ lɑ˥ [Guiyang Mandarin]
I tall surpass he
‘I’m taller than him.’


Another example of head-final structures common in Chinese is the Adjective-Noun order (e.g. 小狗 xiǎo-gǒu “small-dog”, 高山 gāo-shān “high-mountain”, 白衣 bái-yī “white-clothes”). Although Adjective-Noun is the dominant order in all known Chinese dialects, the Noun-Adjective order (which is prevalent in Mainland Southeast Asia) is found in a small subset of nominal constructions in Southern Chinese dialects, as in the animal gender constructions. For example, in Northern Chinese dialects, the word for ‘rooster’ is the cognate form of the Standard Mandarin 公鸡 gōng-jī “male-chicken”; in many Southern Chinese dialects, however, 鸡公 “chicken-male” is the more common word order, as in the Cantonese gai1-gung1, Hokkien kue˩-kak˩, Wuhan Mandarin tɕi˥-koŋ˥, Chengdu Mandarin tɕi˥-koŋ˥, and Liuzhou Mandarin ki˦-koŋ˦.
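
To make the comparison concrete, the two word-order features just discussed (the comparative construction and the order in the word for ‘rooster’) can be coded as categorical values and compared pairwise. The Python sketch below does this for three varieties whose values are stated in the text. It is only a toy illustration with two features and a simple normalized Hamming distance. The names FEATURES, distance and pairwise_distances are assumptions for this example, and this is not the method used in Szeto et al. (2018).

```python
from itertools import combinations

# Feature values as described in the text above (two features only).
FEATURES = {
    "Standard Mandarin": {"comparative": "adjective-final", "rooster": "male-chicken"},
    "Cantonese":         {"comparative": "surpass",         "rooster": "chicken-male"},
    "Liuzhou Mandarin":  {"comparative": "surpass",         "rooster": "chicken-male"},
}


def distance(a, b):
    """Share of features on which two varieties differ (normalized Hamming distance)."""
    names = list(FEATURES[a])
    differing = sum(FEATURES[a][f] != FEATURES[b][f] for f in names)
    return differing / len(names)


def pairwise_distances():
    """Distance for every unordered pair of varieties in FEATURES."""
    return {(a, b): distance(a, b) for a, b in combinations(FEATURES, 2)}


if __name__ == "__main__":
    for (a, b), d in pairwise_distances().items():
        print(f"{a} vs {b}: {d:.1f}")
```

On these two features alone, Liuzhou Mandarin patterns with Cantonese and not with Standard Mandarin. A real survey would of course need many more features and varieties.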

The Amdo Sprachbund

The above examples may not look particularly remarkable to speakers of Southern Sinitic varieties like Cantonese and Hokkien—after all, as those word order features are typical of Southern Sinitic, their presence in Southern Mandarin dialects may not come as a surprise. The truth is that the most interesting Mandarin dialects are not found in Southern China.

In Northwestern China, there is a linguistic area in the Southeastern Qinghai-Gansu border region known as the Amdo Sprachbund (Janhunen 2012; Sandman & Simon 2016). Comprising around 15 language varieties, the Amdo Sprachbund is a region of great ethnic and linguistic diversity, where Amdo Tibetan has served as the lingua franca for centuries. Remember the first two examples at the beginning of the article?

(1’) [Xunhua Mandarin]
ŋə˦ tʰa˦ mɔ˥˧ ʂuo˩˧ kuɔ
I he not say experiential
‘I didn’t tell him.’ (adapted from Dwyer 1995)


(2’) [Huangshui Mandarin]
kɯ˥˧ ʐɯ˨˩˧ xa tʂʰʅ˦ liɔ˥˧
dog meat [xa] eat perfective
‘The dog ate the meat.’ (adapted from Dede 2007a)


These “exotic” sentences (from a Chinese point of view) are examples of Mandarin dialects within the Amdo Sprachbund. Under intense influence of SOV languages in the region like Amdo Tibetan and Monguor, the basic word order of these Mandarin dialects has shifted to SOV.

In addition, like most other SOV languages, they have developed a range of case suffixes. For instance, the [xa] in (2) functions to mark grammatical relationships like patients, recipients, goals, and sources. Likewise, the [lia] in (6) marks the instrument involved in the action, while the [sa] in (7) expresses a motion away from something (what is otherwise known to linguists as the ablative case).

(6) 毛笔 [Xining Mandarin]
nɔ˥˧ mɔ˨˦pi˦ lia ɕie˥˧ tʂɛ
I ink.brush instrumental write progressive
‘I am writing with an ink brush.’ (Li 2002: 86) (our glosses and translation)


(7) 夜来 北京 sa 回来 [Xining Mandarin]
tʰa˦ i˨˩˧lɛ˥˧ pi˦tɕiə̃˥˧ sa tɕiɔ̃˨˦ xui˨˦lɛ
he yesterday Beijing ablative just return
‘He just came back from Beijing yesterday.’ (adapted from Dede 2007b)


Such features are clearly atypical of Chinese, but are indicative of the significant degree of restructuring that Mandarin dialects have undergone in contact scenarios. There are numerous other examples not discussed here, but interested readers are welcome to refer to our paper on this topic (Szeto et al. 2018).

Concluding remarks

Here is a quick recap of what we’ve gone through so far. What we usually learn about Mandarin is mostly about its standardized form (“Idealized Mandarin”). Meanwhile, there is a vast array of local vernaculars in Mainland China which belong to the Mandarin dialect group (“Geographical Mandarin”).

Like all other natural languages in the world, Mandarin is susceptible to influence from its neighboring languages. Given the extensive geographical coverage of the group, Mandarin dialects in different regions of China are in contact with languages of different typological profiles. Unsurprisingly, Mandarin dialects display a considerable level of typological variation under such a setting. The variation within the Mandarin dialect group, however, is severely downplayed or underestimated by most Chinese dialectologists.

As language enthusiasts with good knowledge of the language in question, we were not satisfied with the received wisdom. As mentioned above, we conducted a study which arrived at a completely different conclusion. The large discrepancy between the received wisdom and our conclusion is particularly astonishing if we take into account the fact that our study is primarily based on the analysis of linguistic data published in some major works in Chinese dialectology. Apparently, we may reach radically different conclusions depending on how we analyze and interpret the data at hand. We hope our study can shed new light on the nature of Mandarin, paving the way for further studies on this fascinating and important language.


