Shurely shome mishtake!

If you’re reading this blog you’ve already heard that Chinese is So Damn Hard, but I sometimes wonder whether we make it more difficult than it need be.

Getting it wrong from the start

One of my clearest memories of those early baby steps in Pinyin was the strange obsession some teachers had with the difference between two groups of easily distinguished sounds:

zhi chi shi

zi ci si

Obviously, these syllables seemed pretty exotic, with the so-called vowel represented by the letter ‘i’ being nothing like any vowel I’d ever heard. But even as I started to make that fricative-cum-vowel, I was regularly reminded that the distinction between the 平舌 (píngshé, flat tongue) and 翘舌 (qiàoshé, cacuminal or retroflex, sometimes called 卷舌, juǎnshé) sounds was terribly important.

Any mention of 翘舌 would be accompanied by this sort of gesture.

Books listed minimal pairs to test the student on his ability to recognise the difference between words like 山 (shān – mountain) and 三 (sān – three)…

Obviously, for many Chinese speakers, learning this distinction is important; many dialects simply don’t employ it, but if you want to pass your 普通话考试 (Pǔtōnghuà Kǎoshì, Standard Mandarin Exam) and become, say, a teacher, you’d better be able to make these sounds.

I’m an English speaker. Where I come from, only drunks mix up those sounds.

I thought I’d pretty much discounted such distractions. I was much more worried about the distinction between 商 (shāng) and 香 (xiāng). Maybe that was where I went wrong. Producing a passable ‘xi’ had in itself been no great problem. But I guess at some stage I must have thought: if only I could make that sh with my tongue curled right up, maybe then I’d get that sh/x distinction. After all, it’s called a retroflex.

Oh dear. Perhaps a little bit of linguistic knowledge is a dangerous thing.

However it happened, I somehow came to believe that I needed to make the sounds zh, ch, and sh way up in the roof of my mouth.

Now, I don’t think I was alone in this. I know at least some other foreign learners of Chinese find themselves exhausted by the attempt to produce these sounds, and gradually realise they’re going to have to relax a little. Maybe they even notice, as I did, that the sound produced right up there is just not very much like that of their Chinese friends. Besides, most of those friends don’t dribble that much during conversation.

So, gradually, I found a position with my tongue tip no longer turned up, but turned down a little, just above the alveolar ridge. This was easier, and it seemed to sound a little better.

Of course, this isn’t the whole story. It wasn’t that I’d started off wrong, and then got it right; it was that I’d started off completely wrong, and then managed to get it slightly less wrong.

These sounds are simply not produced that far back or that high up.

If you’re one of those who made the same mistake as me, and then adjusted a little, as I did, maybe you’re saying to yourself, “But, if I made those sounds any further forward, my tongue would be in basically the same place that it is for those English sounds!”

Uh huh…

You’re catching on.

In the next few posts, I intend to show why English speakers should simply not be taught how to pronounce the Pinyin sounds zh, ch, and sh when they start learning Chinese.

Note: I appreciate that many foreign learners of Chinese have a mother tongue other than English, and that may well affect their Chinese pronunciation quite differently. Please do feel free to share any insights below.

14 responses to “Shurely shome mishtake!”

  1. Peter says:

    Yeah, I live in Shanghai, and my pronunciation has… adapted to the region. Based on my observations, I’d say that when locals (or for that matter, anyone who isn’t from 东北) aren’t too busy mixing up their zhi-chi-shi’s and zi-ci-si’s, they’re pronouncing them exactly the same–modulo aspiration/voicing changes–as in English (tʃ tʃʰ ʃ), or at least no farther back than that.

    As for those pirates from up North, I’m not really qualified to comment. Once upon a Tianjin, my pronunciations of zhi-chi-shi were gratingly retroflex, but I’ve not been back there for some time, so I can’t really say whereabouts their tongues are in those piratey mouths of theirs.

  2. Peter says:

    Follow-up: dʒ tʃʰ ʃ in English are often accompanied by lip-rounding (whereas zhi-chi-shi, zi-ci-si, and ji-qi-xi all seem to be pretty lip-rounding-free). It would probably be a bit more productive to focus on _that_, rather than all this retroflex nonsense, when teaching pronunciation. Or, for that matter, we could focus on more important things: [y] vs [u], tones (especially in the context of an utterance, as opposed to in isolation), those tricky ji-qi-xi sounds, and picking up attractive women.

    • Sima says:

      Peter, I’m glad you’ve noticed that in Shanghai. I suspect if you returned to Tianjin now, or even came up to the Northeast, you’d note that it’s really not that different; obviously, there’s a load of mixing here of the zhi-chi-shi and zi-ci-si but, for those who make the distinction, I suspect the position of articulation just isn’t that strange.

      Good point on lip-rounding. The effects can be quite striking.

  3. hanmeng says:

    I agree 100%. It’s ridiculous to insist on a distinction that eludes many native Chinese speakers. But I’ve always been annoyed at the way most beginning textbooks have page after page of drills teaching how to pronounce pinyin. Yes, learning the tones is important, but as for the rest–it’s just not that big a deal!

  4. Sima says:

    hanmeng, the point is that English speakers arrive at learning Chinese with a perfectly serviceable distinction. There’s really no need to insist on anything; the distinction is trivially easy for them.

    I know what you mean about pinyin drills, and I certainly remember thinking, “But when am I going to start learning the language?” Actually, I think that well-handled drills really are useful and important. I guess a lot comes down to how the teacher presents them.

  5. Eric says:

    I teach Chinese in the US. My first language is English. I went through a process similar to Sima’s, slowly moving from textbook-induced tongue-rolling back to the zh/ch/sh sounds I’d always made comfortably in English. Once I started teaching, it was easy for me to see that zh/ch/sh were not problem areas for students, in spite of hearing some other Chinese teachers insist that they were. Ideally, good teachers will know their students’ needs. They will recognize the aspects of pronunciation that need attention and those that don’t (as well as those areas which aren’t pronunciation problems per se, but errors of understanding due to using eyes instead of ears, e.g. Pinyin ‘iu’). I suspect that for many Chinese teachers in China, their students’ language backgrounds are so diverse that it is extremely difficult to become knowledgeable about their specific needs. This is one of the handful of potential benefits that come with starting to learn a foreign language while in one’s home country. Though it’s not certain, it’s more likely to be the case that a teacher who teaches Chinese outside of China will better understand the specific needs of those learners.
    I would be interested to know if some of the zh/ch/sh descriptions that have been canonized in textbooks (‘retroflex’, ‘curl the tongue tip’) are the result of individual/regional/historical features that were more or less accurate for the speaker(s) originally describing them. I have in mind what seems to be the case with tone descriptions in textbooks. Y.R. Chao originally described the third tone (in isolation) as ‘2-1-4’ on a five-point scale. This seems to accurately capture how he produced it. But, at least in modern-day Beijing, this represents an extreme that most speakers would not produce (most speakers produce 2-1-3, 2-1-2 or even 2-1-1 patterns). Still, due to Chao’s influence, most textbooks up to now continue to describe and illustrate the third tone as ‘2-1-4’ (even worse, they don’t mention that this is only the case when it’s spoken for emphasis or in isolation). Has something similar happened in the case of zh/ch/sh?

    • Sima says:


      I’m glad to hear you’ve worked this out in your teaching.

      As for why this happens, I’m going to try to explore this in the next week or two, but I suspect the genuine need many native speakers have to find these sounds, combined with the use of the term ‘retroflex’, gives many foreign learners opportunity to misunderstand. Once you have the idea in your head, it’s quite difficult to change.

      I don’t have a copy of Chao to hand, but suspect he did document the half third tone. I suspect that teaching tone combinations is something many teachers find troublesome once they get past the initial 你好 sandhi. But maybe you could suggest a better way of teaching it!

      I have Chinese books on phonology which seem to very accurately describe zh-ch-sh. I’d be most interested to know whether you have encountered any student books which actually show a sub-apical ‘full retroflex’ articulation.

      • Steve (Syz) says:

        Sima we must have overlapped in posting. Yes, I just looked in Chao. P. 27 he talks about 1/2 third tone

    • Steve (Syz) says:

      Eric, I’m wondering about the 2-1-4 issue. True, Chao describes it that way, but he also describes the 1/2 third tone as 2-1, which is accurate and (as many Chinese teachers have noted) statistically very common, which is why it might make more sense to teach beginners to “aim low” on the third tone (which is kinda what you’re saying). But that’s a separate issue.

      On the 2-1-4 issue in particular, I’m thinking the description might still be valid in China today, if only in the case where you ask a native speaker to produce a third tone in isolation. You think? This might warrant a recording or two…

      • Eric says:

        Steve, I fear I’ve moved us off course from zh/ch/sh, that wasn’t my aim. But as long as we’re talking about it…

        I was able to take a class a few years back with the Chinese phonologist Shi Feng 石锋 from Nankai University. I have on hand his book ‘Exploration of Experimental Phonology’ 《实验音系学探索》. In it he reports the findings from a study of 52 Beijing speakers (a mix of ‘old’ and ‘new’ Beijing speakers, male and female, younger and older). When producing tones in isolation, Shi Feng found that 3rd tones were averaged at something like 2-1-2.5 across all speakers. Only a very small subset of older (>50 years old) ‘old’ Beijing speakers got close to 2-1-4, with 2-1-3 being a more representative top end. I’m not positive this is the study he had in mind, but I remember very clearly that he suggested to us that to capture standard Chinese more generally (as opposed to the Beijing variety), 2-1-2 was probably a reasonable way to present it to learners.

        I certainly agree with both of you on teaching the half third tone. In the first weeks of my classes, I simply call it the ‘low’ tone and, even in isolation, emphasize just the lowness. I find the results of this approach much better than when I started with the 2-1-(whatever) and then tried to make students forget that so they could work on the more common half third tone.

        As you point out, Chao was aware of it, and I don’t mean at all to sully his name; he was awesome. It was precisely because he was awesome that his 2-1-4 analysis became standard. As to whether teachers have or have not paid attention to the half third tone over the years, I’m not sure. Still, naming it the ‘half’ third tone makes it seem as though it is somehow lesser, a ‘marked’, less common version of the third tone. That may be defensible as a linguistic analysis, where isolated bits are often given priority, but as a pedagogical approach it’s quite misleading for learners. I have yet to see a textbook that has a picture of the half third tone, but they all have the 2-1-4 picture.

        • Steve (Syz) says:

          Eric, sorry so slow in getting back. Seems like I’m not getting email updates. These guys that run the website gotta get their act together…

          Anyway, that study sounds really cool. Had no idea it existed. Might have to get a copy of that book. It helps create a believable narrative: basically, that 2-1-4 is what people used to do (so Chao was not wrong, heaven forbid), but that they don’t really do it anymore. All the more reason to aim low on 3, right?

  6. Katie says:

    So I’m still thinking about this. My half-baked theory about people’s palate shapes is almost certainly wrong. But I’m still wondering why native speakers would so consistently mis-describe what they’re doing. Of course, part of the issue is that many (even most?) Mandarin teachers aren’t actually native speakers of a dialect that has retroflexes (or whatever they are), so maybe they don’t trust their own instincts. And since there’s a definite right and wrong when it comes to Putonghua, whatever the description of the standard is, that’s what they’ll teach. Which brings me to another question: are Chinese kids in fact explicitly taught how to pronounce these sounds? I assume yes? And if so, how is it done?

    • Sima says:

      I don’t think the connection to palate shape is half-baked as such. I think the way we draw and describe the points of articulation, what’s actually inside our mouths and what we feel, or imagine, is there could well come into it. This might be one of several contributing factors.

      If you only make ‘flat-tongue’ sounds, and are learning to produce zh, ch, sh, *and* if you have never considered the possibility that one might curl the tongue back over to make a sub-apical articulation, then the ‘hand picture’ above probably makes perfect sense. It could be a useful way for the teacher to remind you that a given sound needs the tongue tip raised a little. On the other hand, if you already have some alveolar and post-alveolar sounds, *and* you’re aware that some languages use sub-apical retroflex articulation, then that little hand signal could be giving you all the wrong signals.

      You’ll note that Chinese phonetics seems to treat the part of the tongue as the key marker, whereas in English we give priority to the part of the palate. 舌根音 seems to be the usual category for g, k, (h), ng, which would normally be labeled ‘velar’ in English.

      There are a number of interesting issues here about translation, and communication in general. There might be a real gap between what you say and I hear, when our bodies, experiences and categories are different. Especially when we forget how subjective these things are.

      I’m not even sure now that many native speakers are mis-describing what they’re doing. Maybe quite a few foreign learners are misunderstanding what they’re being told! I did a Chinese phonetics course a few years back, taught by a pretty good teacher, using a perfectly serviceable book from 北京语言大学出版社 which contained excellent diagrams from the 普通话发音图谱, mentioned in the 3Q post. At the end of the course, it had still not sunk in that I had the wrong idea about zh, ch, sh, r! Looking back at the book now, it’s perfectly clear. Yet, at the time, I didn’t even see anything which conflicted with my then-assured views.

      We must track down some primary school teachers or, better still, teacher trainers, and see what we can find out about how, or whether, children are normally taught these things.

  7. Marcel says:

    In the first sentence in this article you say “If you’re reading this blog you’ve already heard that Chinese is So Damn Hard, but I sometimes wonder whether we make it more difficult than it need be.”

    I do agree that Chinese is kinda difficult, but I think it is also important to discuss what difficult really means. If difficult means “time-consuming”, I definitely agree. But if difficult means that you have to be very intelligent to master this language, I would rather disagree. First of all, the grammar is not that difficult at all, and in comparison with Japanese, for instance, I tend to say that Chinese is easier. Japanese grammar is really hard… at the beginning I was really struggling… and then you’ve also got all those honorific forms… that’s really hard as well.

    My conclusion: Chinese is definitely not easy, but if you can manage to keep up the long-term motivation it is definitely possible to reach a pretty good level… Moreover, you should have some passion for the Chinese language (and probably also China) as well… that surely helps!

    Best wishes from Switzerland

