As you get so amped up for Children’s Day that you can’t sleep, I recommend you while away the wee hours at 语音字典 / Voicedic and then let me know what you think. I just came across it via PKUCN (the language discussion board that we mention on Sinoglot occasionally) and am intrigued.

Caveat: I’m not entirely sure how you’d use it. It offers per-Chinese-character audio in seven “dialects”:


First of all, I was having a bit of trouble with the divisions, but maybe I’ve got it now:

  1. 普通话, Standard Mandarin [OK, everyone gets this]
  2. 广州话, Cantonese
  3. 闽南话, South Fujian / Minnan [Considered one of the big 方言 / fāngyán / dialanguages of Sinitic. Might refer to a group of distinct dialects but in practice, according to Wikipedia, “usually refers to Hokkien”]
  4. 潮州话, Chaozhou language [to quote this researcher, “Chaozhou, a distinctive dialect of the southern Min group” — but I’m not clear how mutually unintelligible it might be with Minnan. Man, the Sinoglot round table really needs to seat some southern-based linguists.]
  5. 苏州话, the Suzhou dialect of Wu [one of Sinoglot’s focal languages]
  6. 标准赣语, “standard” Gan language, i.e. the language of 江西省 (Jiāngxī province) through which the Gan river flows. [This is also a major 方言 / fāngyán]
  7. 上海话, another subdivision of Wu

In terms of capturing the variety within “Chinese”, the selection’s a bit random, missing some major fangyan and covering subdivisions of others that are going to have a lot in common.

The second issue is it’s hard to imagine how far per-character pronunciation is going to get you in understanding a language / fangyan that is not standard Mandarin. But if the data’s reliable* — which I need other folks’ opinions on — Voicedic might be a nice addition to the fangyan tool portfolio.


20 responses to “Voicedic — syllable pronunciation in 7 versions of Chinese”

  1. The Shanghainese is pretty good. Decent quality audio. Beware the Suzhou hua. Terrible audio and inconsistent in many regards.

    Suzhou makes some sense because it’s the old prestige dialect of Wu. Shanghainese makes sense because it’s what most people speak now. Can’t account for the rest.

  2. Syz says:

    Kellen: how mutually unintelligible are Suzhou & Shanghainese Wu?

  3. Daan says:

    Nice! I’m not sure which variety of 閩南話 they use, but it’s not the Taiwanese one, as far as I can tell. It’s close, though. But I can’t speak Taiwanese Min very well, so I may be wrong.

    For 普通話, 輕聲 syllables such as de and zhe are shown, but perhaps not surprisingly, no recordings are available. And when you look up 事兒, you get two separate syllables: shì’ér. Also, and this may just be my tonal deafness, the recording you hear when you look up 一 sounds like a second tone to me?

  4. Daan says:

    Hmm. Can anyone here who speaks Cantonese check out the recording for 香? I looked up 香港 and it seems to me her pronunciation of 香 in particular is a bit unnatural.

  5. Syz says:

    @Daan, I know nothing about Cantonese, but you should listen to the pu3 in putonghua. It’s like being on a roller coaster. I didn’t mention it above cuz I didn’t want to be too critical on first encounter 😀

  6. Syz: tones are hugely different, Suzhou having 8 and Shanghai having more of a pitch accent system than tones. They’re in different dialect families. I’d say a Suzhou speaker could manage to get the gist of Shanghainese but both people in the conversation would have to work at communicating.

  7. Cantonese sounds OK to me judging from my ~100 words. I tried 香港、普通话、多谢、大学。

  8. The English part (en.voicedic.com) seems to be a synthesis.

  9. Ho Sun Yan says:

    @Daan: She sure gets up very high for the high level tone of Cantonese, higher than speakers usually would in natural speech, but I suppose this is for pedagogical reasons. All things considered, the Cantonese recordings on Voicedic seem pretty good.

  10. Ty Eng Lim says:

    The reason for differentiating between South Fujian/Minnan and Chaozhou is that they are mostly mutually unintelligible, at least in the versions most often considered standard (Amoy/Xiamen for Minnan, and Shantou for Chaozhou). There are also huge diasporic populations for each Minnan dialect (Chaozhou is actually a dialect of Minnan, but most people think of Hokkien when they hear Minnan). I am a native speaker of 潮州話. The Chaozhou audio is Chaozhou standard (not 汕頭). The pronunciation is good, though in 潮州話 (aka Teochew) multiple readings for characters are quite common. For example, 上 zien6 is only one way to pronounce it; it can also be pronounced siang6, and a few other ways, I believe. So the usefulness is somewhat limited, though certainly still good to have.

  11. Syz says:

    Ty Eng Lim, thanks for the explanation. Regarding “multiple readings for characters”: is there somehow more of an issue with 多音字 in 潮州话 than there is in 普通话? Or are you just saying that’s a limitation of the character-based approach?

  12. Daan says:

    Ty Eng Lim, thanks for your comment. I’d been wondering about its recordings for a while. So it’s standard 潮州 閩南話. And Syz, I’m not familiar with 潮州话 myself, but I imagine that’s because phonological change is not constrained by the script? :)

  13. Ty Eng Lim says:

    To my knowledge, the reason Teochew has so many 多音字 is natural phonological change, but also “natural” interference from the national Chinese languages of the past. A good example of this actually comes from Hokkien. There are two ways to count from 1-10. One version sounds very similar to Cantonese, which reflects the language of the Tang dynasty; another retains aspects of Old/Middle Chinese. I don’t actually speak Hokkien, so maybe a speaker can confirm this. You can get the gist of it from this page: http://www.learnhokkien.com/lesson_05.htm

  14. Zev Handel says:

    @Syz, Most Chinese languages are full of what can be called doublets: morphemes with related (or identical) meanings and related pronunciations that are written with the same character. Mandarin has fewer of these than other Chinese languages, but still has a fair number. The Min languages are probably the most notorious for this phenomenon. As Ty Eng Lim says, the main reason has to do with influence on southern dialects from northern prestige dialects, often as manifested in reading traditions. The “proper” way to read Classical Chinese, i.e. to pronounce written characters aloud, was taught in schools and was often based on imitations of northern prestige pronunciations. Many of these pronunciations, especially when embedded in literary compound words, then ultimately made their way into the colloquial language, where they existed side-by-side with the morphemes that had undergone different processes of sound change within the natural history of the language.

    I’ll give you an example from Mandarin. When the ru-tone was lost in the Beijing area, and the -k ending changed first to glottal stop (as in modern Wu) and then disappeared altogether, the natural phonological development was diphthongization of the vowel. Thus we get words like 白 bái ‘white’ from earlier pak [these early forms are just rough approximations, don’t take the vowels too seriously], 得 děi ‘must’ from earlier tək, 薄 báo ‘thin’ from earlier pɔk, etc. At the same time, a reading tradition developed that was taught in schools, which said you should pronounce these characters closer to the way they sounded in higher-prestige southern forms of Mandarin (like Nanjing), thus: 白 bó ‘white’, 得 dé ‘obtain’, 薄 bó ‘thin’. Many of these reading pronunciations disappeared, but quite a few entered the spoken language, in some cases supplanting the colloquial pronunciation, but in some cases existing side-by-side with it (often in different contexts).

    Multiply this phenomenon a hundred-fold and you get the situation in Southern Min dialects.

  15. Syz says:

    @Zev: so “prestige” is loaded with irony, if I’m understanding right. Where for many characters you might have a southern language looking to the north for a “proper” classical pronunciation, in the case of 白, 得, and 薄 you have the north looking to the south! And neither pronunciation is exactly the original, of course. Very funny.

  16. Zev Handel says:

    Well, what is considered prestigious varies with time and social circumstance, of course. And the terms “north” and “south” are quite vague and have many different referents. In the context of the Sinitic languages as a whole, we tend to use them most often to refer to Mandarin (north) vs. non-Mandarin (south), but in the context of the Mandarin family (and in my note above), “south” refers to the Mandarin-speaking area around the Yangtze. Through the Ming and into the mid-Qing was a period when these southern Mandarin pronunciations were considered prestigious, and Beijing pronunciations even came to be seen as bastardized, influenced as they were by non-Chinese northern peoples (i.e. barbarians). So in that context, “southern” does not mean Cantonese!

    South Coblin has written quite a few interesting papers (building on the work of Chinese scholars like Lu Guoyao) on the role of the two competing guanhua koines, “běiyīn” and “nányīn”, in the Ming period, and on the gradual shift of prestige from Nanjing to Beijing in the middle of the 19th century.

  17. Zev Handel says:

    So to clarify, I don’t think “prestige” is ironic. But it’s not constant.

  18. Syz says:

    Thanks, Zev, I get it. You’re right: I was getting my north/south referents mixed together — not really so ironic.

    South Coblin? Wow. For the record, babynamewizard.com doesn’t show ANY results for South as a given name, even in an advanced search, but there he is, at the University of Iowa.

  19. Zev Handel says:

    South is actually his middle name (he published under “W. South Coblin”), but it’s what he goes by. It is an interesting name, I’m sure there’s a story behind it. By the way, he’s an excellent scholar: meticulous, logical, and thoughtful.

  20. Kellen says:

    Fascinating stuff, guys. It’s always an educational experience, Zev.

