"A demographic dictionary of Modern Chinese"

Some wandering around led me back to an old Language Log post that mentioned a site showing a state-by-state comparison of English search terms.  It turns out that it does the same for Chinese, province by province!

The site is called Lexicalist.  I didn’t spend too much time with it because I know that readers will come up with more interesting, less obvious searches than mine.

Here is 玩儿:


You can clearly see that writing it that way is not too popular in the South, and that it is most popular in Tianjin/Beijing.

They also have gender demographics:


So women 玩儿 more than men?

Happy International Women’s Day, by the way.

11 responses to “"A demographic dictionary of Modern Chinese"”

  1. Julen says:

    Thanks for the link, it IS really an amazing tool, assuming it is accurate.

    I just tried a word that I have been curious about for some time, “晓得” (xiao3de2), which is the most common way to say “to know” in Shanghainese, and also exists with the same meaning in Putonghua (although it is less used). Here’s the link

    It is very interesting that it is not at all in Shanghai, but rather in Sichuan, where this word is most used. And looking at the map and the numbers, there is a clear trail West to East across the middle of China where this word fades down all the way to Shanghai.

    I interpret this as evidence of a phenomenon that I and others have often observed: speakers in regions whose dialects are very close to Mandarin (like Sichuanese) tend to mix dialect words into Putonghua a lot more than those who speak relatively distinct languages (like Wu, etc).

    Interestingly, this makes the Mandarin in some Northern regions more difficult to understand than the Mandarin spoken by Southern people (supposing they can actually speak Mandarin, of course). And indeed, my observations are consistent with the Lexicalist map: I rarely hear people in Shanghai using “晓得” when speaking Mandarin, whereas they use it all the time in Shanghainese.

  2. Julen says:

    One more interesting try:

    讨厌! – 70% female users
    他妈的! – 66% male users.

    The male/female stats sound right to me here…

  3. No legend on the map – what do the colours mean?

  4. Never mind, I found my answer. Wish the map would actually tell us to click on the image for the stats.

  5. Dan says:

    晓得 is very common in Chongqing and Sichuan.

  6. Dan says:

    I’ve tried looking up 齉 because some of my Chinese friends say they never use that word and I was wondering if this was just a regional thing or 齉 is not used at all.
    Anybody know more about 齉?

  7. @Julen (i):
    曉得 (晓得, hew tak) is common in Cantonese, especially among people older than 50 years old. Here in Hong Kong, I’d say four-fifths of people would use “sik tak” and it’s only the older folks who would once in a while use “hew tak.”

    Any Cantonese speaker will know ‘hew tak’ is from Toishan in Guangdong, like “aieeyah” from Xiamen – so I’m surprised that Sichuan gets high ratings for its use. But then again, the Sichuanese and Hunanese are famous for disguising their speech in front of outsiders (Chinese and non-Chinese alike) with other regional accents/dialects and expressions (if they happen to know them). You just need to have grown up with older Chinese folks to really appreciate this eccentricity with the Sichuanese and Hunanese.

    ‘Hew tak’ is quite a bit more than mere ‘to know’ – it’s closer in flavour to ‘to be hip to’ or ‘to be savvy about’. Actually, it’s probably closest in meaning and flavour to the Welsh English ‘he’s tidy to the ropes’.

  8. Julen says:

    @nakedlistener – in Wu it just means to know, with no particular connotation. And I think the same is true for Sichuanese (from Sichuanese people I have heard speaking on TV).

    Just to clarify my first comment: what I meant is that the word 晓得 is surely common to all the languages across this West-East axis: Wu, Gan, Xiang, Sichuanese Mandarin. The reason it fades West to East on the map is that the western side of the axis has dialects closer to standard Mandarin, so speakers there tend to mix more dialect words like 晓得 into their Mandarin.

    This is consistent with the Cantonese situation you describe. 晓得 is also a common word there, but since Cantonese is even more distinct from Mandarin, its speakers would never use this word when they are trying to speak or write Mandarin.

    • Curt says:

      I was curious about this myself (which is how I found this thread) and I took my investigation a step further, to 西游记, the language of which is, presumably, the Nanjing Mandarin that functioned as the koine of the Ming era. 晓得 is used more than 200 times, and 知道 is only used about 10 times. Perhaps 知道 is the “dialect word”, and it has filtered out from Beijing and into the local language of many southern dialects in the last 60 years.

      Or, another possibility, following the theory of words flowing out like a wave from influential areas (the classic case being how you can find many of the same words at areas equidistant from Kyoto, such that the dialects at opposite ends of Japan have many words in common that are not used elsewhere): 晓得 is a word that was popularized in Nanjing and didn’t fully replace 知道 in faraway areas like Beijing and Guangdong. (Sichuan Mandarin’s closest relative is 江淮 Mandarin.)

  9. Chris Waugh says:

    But is 晓得 a “dialect word”? I seem to recall it cropping up a lot in Lao She’s 《骆驼祥子》 i.e. ROC-period Beiping.

  10. Carmel James says:

    This looks good, but I just tried about 20 very common Chinese words and only one didn’t give me this reply: “We don’t have any current demographics for that word – try another!” I even clicked on most of the words that are trending at the moment.
    I wonder if there is a technical problem with the site at the moment.
