Every second-language student of Mandarin is told pretty quickly that yī (hanzi: 一) is subject to “tone sandhi”, meaning it changes tone depending on the tone of the following syllable*. HOWEVER, the founding rules of Pinyin say that the tone sandhi should not be marked and that you should continue to write yīgè, for example, even though it’s said “yígè.”

Sometimes this “rule” feels really awkward. In this comment on Beijing Sounds, Randy Alexander took me to task for writing bùshì (as the rules would have it) instead of búshì (as it’s pronounced).

He’s not the only one. Check out this page from my daughter’s new favorite book:

Here we have yígè, yìshēng, and yí going over to some word on the next page. All sandhi are marked.

Now before Pinyin.info takes me to task, let me note that the usage on the page above violates a much more important rule: it sep ar ates each syl la ble instead of putting syllables together into words. Well, you can’t have everything, and anyway Pinyin is used here not so much as a script in itself as a way to get kids past the characters they don’t know.

Still, does anyone know if this is going to become the new standard for Pinyin? Or is it just a single publisher’s idiosyncrasy?


*Instantspeakchinese puts it succinctly (h/t Sinosplice, where John follows up with some other good comments on tones):

Rule 4: Rules concerning the word “yi.”

  • The word “yi” is 1st tone when used as part of a number (yi, er, san, … shiyi).
  • The word “yi” is 4th tone when preceding 1st, 2nd, or 3rd tones. (yi ge ren)
  • The word “yi” is 2nd tone when preceding a 4th tone.

For some reason that’s always been hard for me to incorporate into everyday speech, and I continue to make mistakes.
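Rule 4 as quoted above can be written out as a small Python function. This is only a minimal sketch: the name `yi_tone`, the `in_number` flag, and the convention of passing `None` when no syllable follows are illustrative assumptions. The quoted rule says nothing about a following neutral tone, so the sketch simply falls back to first tone in that case.

```python
def yi_tone(next_tone=None, in_number=False):
    """Return the spoken tone (1-4) of 一 under the quoted Rule 4.

    next_tone: tone of the following syllable (1-4), or None if nothing follows.
    in_number: True when 一 is read as a digit, as in counting (assumed flag).
    """
    if in_number or next_tone is None:
        return 1  # citation tone: counting, or nothing follows
    if next_tone == 4:
        return 2  # second tone before a fourth tone
    if next_tone in (1, 2, 3):
        return 4  # fourth tone before first, second, or third tone
    return 1      # assumption: anything else (e.g. neutral tone) keeps first tone


print(yi_tone(4))                  # 一个 yí gè
print(yi_tone(1))                  # 一生 yì shēng
print(yi_tone(4, in_number=True))  # counting: yī, èr, sān
```

The function only decides the tone; whether to then write yígè or yīgè is exactly the orthography question raised in the post.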

  1. Sima says:

    I guess, if the pinyin is just there to support the reading of the main Chinese script, there’s some sense in marking sandhi…but it opens a right can of worms. Where does one draw the line? Do 七 and 八 get special treatment too? Do 3rd tones then need special treatment as well?

  2. Joel says:

    I’ve been undecided for a long time about whether to write pinyin according to the rules or according to pronunciation.

    If they’re going to write 一 as it’s pronounced, is there any reason to not also render 3-3 tone combinations as 2-3 (“hén jiǔ” rather than “hěn jiǔ”)? Seems they’re mixing methodologies here.

  3. First, to answer your question:

    Still, does anyone know if this is going to become the new standard for Pinyin? Or is it just a single publisher’s idiosyncrasy?

    If you’re talking about a descriptive standard (defined by predominance of usage), it’s been a standard for certainly more than ten years.

    Of course this book belongs to your daughter. Can you find any other book of hers where pinyin is written in the correct way (I had an urge to put scare quotes around “correct”), with syllables in a word strung together, and tones marked without sandhi?

    I have nearly a hundred such books in my house, and all of them have the syllables separated. The sandhi is variable. Some books have it and some books don’t, but I think (without pulling them all out and doing a survey) most books that do, limit it to 一 and 不. It would be a special thing indeed to find a book like this where the syllables were not separated.

    Second, Instantspeakchinese does not put it succinctly. They’ve left off some important points. First of all, in most cases, they’re not talking about the word “yi”, but rather the syllable “一” (without the character, it could be one of hundreds) whether or not it occurs in a word. The first bullet point could be improved as follows:

    * The syllable “一” is 1st tone when it occurs alone, as part of a number, at the end of a multi-syllabic word, and before the neutral tone.

    Hopefully that covers everything, but if other readers can improve upon it, please do; having clear rules like this certainly helps non-native speakers.

  4. Kellen Parker says:

    My first thought was what Joel said. Before I really read this post I looked at the image/pagescan and knowing I was looking for funkiness the 很久 hit me first.

    I’ve been struggling with the writing sandhi dilemma recently when it comes to Shanghainese. The tones of individual words matter a lot because the first syllable’s tone dictates the pitch accent system / tonality of the rest of the phrase, but then since the tones of every other syllable are all but ignored, it seems a bit silly to write it out for each one, so I’m instead inclined to write post-sandhi readings.

    I don’t think I have too much trouble with the 不 and 一 sandhi rules in speech, just because so much of my spoken Mandarin tends to be separated from my written Mandarin in my brain. It’s a lot more sound-mimicking so búshì might not be so definitely linked to 不是 on paper. This tends to show up when reading texts out loud as a lot of the sandhi gets lost.

  5. Syz says:

    For 3-3 tones, marking those seems pretty impractical. Seems like it’d be pretty easy to draw the sandhi-marking line at 不 and 一.

    @Sima: maybe everyone else knows this, but I’m lost… 七 and 八? Anxiously awaiting elaboration.

    @Randy: about the “descriptive standard” of marking 不 and 一 tone sandhi, maybe there’s no clear winner yet? I pulled a few books randomly and got two with and two without.

    For writing pinyin with separate syllables, there’s no contest. I’ve never seen a kid’s book written with pinyin words instead of syllables.

  6. From the Contemporary Chinese Dictionary (现代汉语词典(汉英双语)):

    The character 七 is pronounced in the high and level tone when it is used independently or at the end of a phrase or sentence, such as in 十七 shíqī, 五七 wǔqī, 一七得七 yīqīdéqī, 七夕 qīxī, 七年 qīnián, 七两 qīliǎng; and it is pronounced in the rising tone when it precedes a character that is pronounced in the falling tone, such as in 七月 [qíyuè] and 七位 [qíwèi].

    There is a similar explanation for 八.

  7. Julen says:

    * The syllable “一” is 1st tone when it occurs alone, as part of a number, at the end of a multi-syllabic word, and before the neutral tone.

    I agree with Randy that this is an essential point.

    It was very late in my learning that I realized I was doing my “yi”s wrong, because this rule is often not properly explained. For example, in cases like “唯一的。。。”, “第一个” etc., we are so used to turning the 一 into a fourth tone that if nobody tells you, you can be doing it wrong forever.

    And it is important for understanding because that First tone is prominent, native speakers are expecting to hear it and they might misunderstand you if you don’t mark it… well that is the experience I have anyway.

  8. Syz says:

    @Randy: thanks. That seems crazy — at least by Beijing standards. I just asked a couple of people in the family here about 七月 [qíyuè], because I didn’t trust my own intuition that it was qīyuè. First I got them to say 七月 in the course of conversation. Both pronounced it qīyuè. Then I asked about the qíyuè version. One said qíyuè is just wrong. The other said people who say it that way 带口音——反正不是北京话 (“saying it with an accent — not Beijing dialect”).

    Do you hear this form of sandhi in the northeast?

  9. Not sure if this was mentioned by others, but I remember at least one (semi-)official source stating that tone-sandhi marking is allowed for educational purposes. It might have been the following source, but I’m not too sure: Yǐn Bīnyōng (尹斌庸), Mary Felley (傅曼丽): Chinese romanization: Pronunciation and Orthography (汉语拼音和正词法). Sinolingua, Beijing, 1990, ISBN 7-80052-148-6, ISBN 0-8351-1930-0.

  10. Duncan says:

    Hm, I thought that it wasn’t universally accepted among Chinese linguists that two third tones in a row equates simply to the first character becoming second tone. Some suggest that the second syllable also changes to become a ‘half’ third tone, and there is also great controversy about what happens when three third tones occur together.

    When we did a beginner’s dictionary we included glosses for ‘yi 一’ and ‘bu 不’, and marked tone sandhi in the pinyin throughout for those two, but didn’t do so for two third tones in a row.

    Seconding that I have never heard 七月 [qíyuè] or the like before…

  11. Zev Handel says:

    Randy, Syz,

    The tone sandhi rules for 七 and 八 are described very clearly for Beijing dialect — as spoken 100 years ago. It’s possible that there are Mandarin dialects that still have these tone sandhi rules, but I don’t think any but the very oldest speakers in Beijing have it. It is not taught in Mandarin classes in the US and for good reason.

    I think the easiest way to remember when to pronounce 一 as yī and when to sandhify it is by analogy with ‘two’. Wherever you would use èr, you say yī. Wherever you would use liǎng, you use yí or yì. This of course begs the question of what the underlying rule really is. It’s easier to think about that question in terms of 二 and 兩.

  12. Do you hear this form of sandhi in the northeast?

    I guess at least 50% of the time. Probably much more. I think of it as very normal.

    I’m wondering about 三 though.

    @Zev: Interesting rule. It seems to break it down to whether 一 is used with a measure word or not.

  13. It’s tough to create a perfect rule for tone sandhi. If pinyin was written with tone sandhi, then dictionaries would be really confusing. Words starting with “一” would need to be checked in 3 different spots!

  14. Btw, I’ve found Chao to be a good resource on Sandhi in Mandarin. I have implemented simple Sandhi rules programmatically, but I kept the rule set small. It includes the variations of the fifth tone described by Chao. Not sure if anyone mentioned that already.

    Yuen Ren Chao: A Grammar of Spoken Chinese. University of California Press, Berkeley, 1968, ISBN 0-520-00219-9.

  16. Doug says:

    Does anyone have a sense of when 个 (especially in 一个) is said in the neutral tone? I have been listening to old recordings of one of my Chinese teachers who, in reading from a list in citation form, says yígè (一个) with a fourth tone on the 个, but qíge (七个) and báge (八个), the latter two with a neutral tone on the 个.

    • Steve (Syz) says:

      I don’t know any good rules offhand for when or when not 个 gets 轻声, but your description of qíge (七个) and báge (八个) is interesting. That’s a form of tone sandhi that I’ve heard once or twice (from dongbei folks, if memory serves) and have read described in texts, but it’s not current in Beijing and I’m a bit surprised that you find it in an “official” recording for learning purposes.

  17. Doug says:

    After Steve’s reply was posted, I asked a Chinese student of mine in Beijing to check with classmates from Beijing and Dongbei on differences in number sandhi in the two places. The consensus is:

    1. In both areas of China, the 一 in 一个 is always said in the second tone even though it is part of a number;
    2. The 七 in 七个 and the 八 in 八个 are said in the first tone by younger speakers in Beijing, but in the second tone by older speakers in Beijing (thus the pronunciation of my professor who is now retired), whereas even younger speakers in Dongbei still use the second tone — not sure where in Dongbei;
    3. 个 can be said in the fourth tone if stressed or in the neutral tone if unstressed with all numbers in both areas;
    4. Producing 个 in the neutral tone has no bearing on the tone of the preceding number — sandhi still applies (if it applies) as though 个 were in the fourth tone (which phonologists whose theories allow ordering of rules would call a ‘counterbleeding’ rule order).
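The rule ordering in point 4 can be illustrated with a toy Python example. This is a hedged sketch, not a phonological model: the function names, the use of `0` for the neutral tone, and the restriction to a two-syllable 一个 are all assumptions made for illustration.

```python
def sandhi_yi(tones):
    """一 (underlying tone 1) becomes tone 2 before a tone-4 syllable."""
    first, second = tones
    if first == 1 and second == 4:
        return (2, second)
    return (first, second)


def neutralize_ge(tones):
    """Unstressed 个 loses its fourth tone; 0 stands for the neutral tone."""
    first, _ = tones
    return (first, 0)


underlying = (1, 4)  # yī + gè

# Counterbleeding order: sandhi sees the underlying fourth tone first.
counterbleeding = neutralize_ge(sandhi_yi(underlying))

# Opposite (bleeding) order: neutralization removes the trigger for sandhi.
bleeding = sandhi_yi(neutralize_ge(underlying))

print(counterbleeding)  # (2, 0), i.e. yíge
print(bleeding)         # (1, 0), i.e. yīge
```

Only the first ordering yields the yíge pronunciation described in the list above, which is why the order of the two rules matters.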

    • Steve (Syz) says:

      Hi Doug, these look pretty good. A couple comments:
      1) “even though it is part of a number”?
      2) My intuition is that Zev has it right in the comments above. In Beijing, maybe the very oldest speakers still do 七,八 sandhi, but basically no one else. For example, my parents in law in their 70s don’t.
      4) this is interesting. As soon as you put it out there, I realized that seems right to me (and I’m going to pay more attention now), but I’d never seen it put into an explicit rule.

      One other thing. I’ll bet this has dialect / subdialect variability, even within Beijing & Dongbei. It will be interesting to see what we turn up on Phonemica recordings.

  18. Doug says:

    Thanks, Steve.

    To clarify, my comment about yi in yige being part of a number (well, the whole part unless you count the measure word as part of the number), was in reference to “Rule 4” quoted in your April 28 post. I was not sure what was meant by ‘number’ there.

    Phonemica is very cool.
    Was it inspired by the International Dialects of English Archive?

  19. Steve (Syz) says:

    I’m embarrassed to admit I hadn’t heard of the Int’l Dialects of English Archive. Very interesting! We had a number of sites that inspired us (see this post) but hadn’t come across that one. The biggest structural difference I see right now is that IDEA has a sizable reading component. With Phonemica, we’re going for 100% unscripted speech (and trying to get interesting stories, not just language samples).

    Anyway, thanks for the heads up. We’ll certainly see if we can borrow some ideas from there, and it’s an interesting source in its own right.

  20. Doug says:

    My student on the ground in Beijing did some more checking around with his friends in college from Dongbei and Beijing and found that in DB, you get 24 even in 七月份、七块钱、八月份、八块钱. (Likewise for 一, too, I would guess, though guessing is always risky with these things.)

    The reason IDEA uses “Comma Gets a Cure” and an oral history/interview is that, unlike the older “Rainbow Passage” or an oral history, the “Comma” passage is designed to place every sound of English found in any dialect in roughly every relevant context, to be narrative in structure and to be somewhat dialectally neutral in terms of grammar and the lexicon. (I know this because I was its editor.) This allows linguists, speech therapists, language teachers and dialect coaches in the entertainment business to share samples and make comparisons meaningfully. If you decide to incorporate such a feature into Phonemica, I would be happy to work with you on that.

  21. Doug says:

    Also, perhaps, have a look at the Visual Accents and Dialects Archive: http://mith.umd.edu/vada/ and resources listed there.

  22. Steve (Syz) says:

    good stuff, Doug. I’ll respond offline

