As you probably know, I’ve been busy lately with another project called Phonemica or 乡音苑. To quote the description from the project page:
Phonemica is a project to record spoken stories in every one of the thousands of varieties of Chinese in order to preserve both stories and language for future generations. We are a team of volunteers working within China and abroad.
Our mission: Bringing the richness of oral Chinese to a wider audience, through the words of natural storytellers, from every corner of the world where Chinese is spoken.
Now Phonemica is raising funds through an Indiegogo campaign that runs through June 9. Contributions will cover some new hardware, hosting, and other costs for the coming year. We need your help to keep it going and to continue being able to provide recordings of spoken Wu. How do we need help?
1. Financially, if you can swing it. We’d love it if you could donate to the project fundraiser.
2. Helping us spread the word. If you know anyone who might be interested in the project, please let them know.
The Shanghainese IPA tool is back up, but with a caveat: The data is not guaranteed to be accurate. The current data set is taken from various resources, and then applied to and extrapolated from the 广韵 rime tables. As such some characters may not return accurate readings. The data will be updated, but likely not until this summer.
I’m super thrilled to announce that we now have recordings of Gaochun Wu at Phonemica. Gaochun dialect was my very first experience with Wu, and it’s basically what got me started down the very specific path I’ve been travelling for the past 5 years.
Gaochun 高淳 is a dialect in the Xuanzhou 宣州 dialect group. Xuanzhou is most significantly the dialect family of Wu spoken in Anhui 安徽 province, with Gaochun (on the outskirts of Nanjing 南京) being one of the few dialects in the group that are spoken in Jiangsu 江苏. If any Wu dialect could be classified as threatened, Gaochun and related Xuanzhou dialects would be it.
This group of dialects is a pretty good example of Mandarinisation (官话化), with clear changes from one generation to the next. With a much smaller speaker base than Suzhou or even Ningbo, there’s less potential resistance to these sorts of changes.
Head over to Phonemica and have a listen. It may be one of the few places on the web you can find such recordings of the dialect.
Many thanks to Claire for helping me get the recordings, and for introducing me to the dialect back in 2007.
The full version of my introduction to Wu is now up over at Phonemica. It’s the section I’d posted here before, plus a lot more. The writeup provides a quick introduction to phonology, syntax, tones and tone sandhi, and some vocabulary, all with examples.
Head over and take a look. And be sure to let us know if you’d be interested in helping with the overall project. There’s a lot to do and we’re always looking for help.
Just a quick note to say that I’ve been working on a couple small side projects, one of which involves re-approaching the Moka Garden Embroidery Mission publications, as well as trying to track down some of the sister-school publications from the era and area.
I’ve also managed to dig up a number of passport applications, consular registration papers, US census reports and alumni newsletters from which I’ve been able to populate a pretty clear timeline of the lives of those involved in the mission, in particular that of Frances Burkhead who for many years was the superintendent of the Moka Garden mission. I was pleased to learn she lived to the age of 85, and ended her life in the same town where it began. More on all that later.
CNY is fast upon us, and I’m hoping that in addition to a little bit of domestic travel I’ll be able to get a lot done with Phonemica as well as working on a number of such side projects as Moka Garden.
The following is from the tone sandhi section of a writeup on Wu I’ve been working on for Phonemica. It’s a draft of a single section. The full version will appear on Phonemica in the near future. I’ve decided to post it here in its current form in case it proves useful to have a clearer explanation than some of the other sources on the topic.
Wu dialects typically have 7 or 8 tones which follow the traditional system of four tones (ping, shang, qu, ru) with two registers (yin and yang). Tone sandhi, the way in which tones interact with each other, is remarkable in a number of dialects, most notably that spoken in urban Shanghai.
In Mandarin, tone sandhi is limited to a few specific combinations, such as when two dipping tones become a rising tone followed by a dipping tone, e.g. 老虎 lǎohǔ becomes láohǔ. However, in dialects of Wu, specifically in the northern Taihu dialects which we’ll look at here, the tones follow a set pitch contour that runs throughout a whole multi-syllabic word or phrase. This contour can be determined in one of two ways.
First, for multisyllabic words, the contour is determined by the first syllable and follows a pattern based on the number of syllables in the word. The following is an example from the Changzhou Taihu dialect. Two four-syllable words are given below, the second having different syllable-level tones than the first. The numbers correspond to the tones of the characters in isolation, 1 being low and 5 being high, thus 24 indicates a rising tone while 55 is a high level tone.
大清老早 → 大24 清55 老45 早45
動手動腳 → 動24 手45 動24 腳55
Since it is the first or left-most syllable that determines the pattern, we say that word-based sandhi is left-prominent. That is, the tones of the other syllables in the word are ignored in favour of those assigned by the phrase contour called upon by the first. Since both of these examples begin with a syllable having a mid-rising (24) tone contour, and since both are 4 syllables in length, the resulting contour for both phrases should be the same after sandhi changes, which are as follows:
大24 清55 老45 早45 → 大21 清21 老44 早21
動24 手45 動24 腳55 → 動21 手21 動44 腳21
Despite these two examples having different underlying tones, the left-prominent sandhi system assigns both words the same surface contour since they share a tone on the first syllable. In the dialect of Changzhou, a four-syllable word beginning in a yang-qu tone (the above 24 tone) will always result in an overall contour like that above, with the stress falling on the third syllable. For this reason, it is often said that Northern Wu isn’t a tonal language in the typical sense, but rather should be considered a pitch accent system like some dialects of Japanese and Korean. Of course, different Wu dialects have different ways of handling the tones. In most Northern Wu dialects, however, we should expect a system like that outlined above.
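The left-prominent rule can be sketched as a table lookup. This is only an illustration: the names `WORD_CONTOURS` and `word_sandhi` are made up for the example, and the table holds nothing but the one Changzhou pattern given above (first syllable 24, four syllables). A real table would need an entry for every first-syllable tone and word length.

```python
# Word-level (left-prominent) sandhi: the surface contour depends only on
# the citation tone of the first syllable and the number of syllables.

# Hypothetical lookup table. Only the single pattern attested in the
# examples above is filled in.
WORD_CONTOURS = {
    (24, 4): [21, 21, 44, 21],  # yang-qu first syllable, four syllables
}


def word_sandhi(citation_tones):
    """Return the surface tones for one multisyllabic word."""
    key = (citation_tones[0], len(citation_tones))
    if key not in WORD_CONTOURS:
        raise KeyError(f"no contour recorded for {key}")
    # The citation tones of the non-initial syllables are ignored entirely.
    return list(WORD_CONTOURS[key])


daqing = word_sandhi([24, 55, 45, 45])    # 大清老早
dongshou = word_sandhi([24, 45, 24, 55])  # 動手動腳
print(daqing, dongshou)
```

Both calls look up the same key, which is why the two words come out with identical contours despite their different underlying tones.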
As mentioned above, the sandhi system for multi-syllabic words is referred to as left-prominent. That is, only the left-most syllable matters for the overall contour. However, for bi-syllabic phrases which do not themselves make up single words, the sandhi rules are different. In these multi-word phrases, the system is right-prominent. That is, the right syllable retains its original tone, while the left syllable is neutralised within its register (yin or yang, as mentioned above). Looking at an example from urban Shanghainese, we have 讀書 /dɤ sɿ/. These two characters together have two different readings: to read a book and to study. Since to read a book is a phrase, it has a different tone contour than the same syllables meaning to study, the latter being a single word in Shanghainese. Tone sandhi applies as follows, with right-prominent sandhi for the phrase and left-prominent sandhi for the word:
to read a book (right-prominent phrase)
讀12書53 → 讀22書53 / 12 53 → 22 53
to study (left-prominent word)
讀12書53 → 讀11書23 / 12 53 → 11 23
In the first example, to read a book, the tone on 讀 is 12 in isolation, but it is an entering tone and so it gets neutralised to 22. 書 retains its original tone of 53 because it’s the prominent word in the phrase. In the second example, to study functions as a single word, so the tonal contour of the whole word is determined by 讀. Like the four-syllable example above, two-syllable words also have set contours, and the contour for such words beginning with an entering-tone syllable is 11.23. Thus, in the example of to study, the tone on both syllables is modified from the isolated tone, but in a pattern determined by the first syllable.
In addition to the set word-level contours, there are set values for tone neutralisation in phrase-level sandhi. Specifically, yin tones neutralise to 44 while yang tones neutralise to 33. The exception is the ru class of tones, which is neutralised to 22 regardless of register.
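The phrase-level rule is simple enough to write out as a small function. Again this is just a sketch with made-up names (`NEUTRAL`, `NEUTRAL_RU`, `phrase_sandhi`), using only the neutralisation values stated above. Treating 讀 as yang register is my assumption based on its voiced initial, though for a ru syllable the register makes no difference to the result.

```python
# Phrase-level (right-prominent) sandhi for a two-syllable phrase:
# the right syllable keeps its citation tone, the left one is neutralised.

NEUTRAL = {"yin": 44, "yang": 33}  # non-ru tones, by register
NEUTRAL_RU = 22                    # ru (entering) tones, either register


def phrase_sandhi(left_register, left_is_ru, right_tone):
    """Return [left, right] surface tones for a two-syllable phrase."""
    left = NEUTRAL_RU if left_is_ru else NEUTRAL[left_register]
    return [left, right_tone]


# 讀書 'to read a book': 讀 is an entering tone (register assumed yang),
# 書 has the citation tone 53.
print(phrase_sandhi("yang", True, 53))
```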
This again is common in Northern Wu dialects, though each dialect will follow a set of sandhi rules unique to that dialect. For that reason we won’t go into more detail here.
朱曉農 Zhu Xiaonong，A Grammar of Shanghai Wu，Lincom Europa，München，2006
As part of my efforts to improve the accuracy and usability of the Shanghainese phonetic data set, I’m going through and running parallel collections for Suzhou and Changzhou. This is all being done by filling out a copy of 方言调查字表 for each, putting that data into an online database and then applying it to a second database, itself based roughly on the 廣韻, but with revisions. If you look for more than 2 seconds on Google you can find a PDF of 方言调查字表, however the retail price for a nice clean published version is 16RMB or anywhere from NT$40 to NT$70 in Taiwan. That store to which I linked just now is a great place to buy Mainland-published books in Taiwan, by the way.
This approach has yielded some interesting discoveries. For starters, there’s a lot of evidence for plain old borrowing from Mandarin, tone class and all, for a number of words that otherwise should be entering tones, about which I wrote fairly recently. Beyond that, it’s also good to see, side by side, pronunciations for common words in Suzhou, Changzhou and Shanghai. That brings me to another excellent book for fangyan research/comparison, 汉语方言字汇, edited by 王福堂. It’s basically 方言调查字表, filled out for twenty different dialects including Suzhou and Wenzhou. While I’m relying more on 汪平’s Suzhou dialect dictionary, it’ll surely be useful to fill in some inevitable holes. I’m hesitant to include Wenzhou too much, as in general it’s far removed from Taihu dialects like Changzhou, Suzhou and Shanghai, and at this stage it’ll be too likely to influence my judgements of what’s been borrowed versus what’s a natural phonetic divergence. Still, a great book for a general overview.
Beware shopping on Kongfz.com. While I’ve had great luck with them in the past for buying antique books, shipped from Jiangsu to Jiangsu, I’m a little wary about whether or not my latest purchase will actually make it to Ilha Formosa as it should.
I’ve been working with some data on Changzhou Wu these days. It’s interesting because aside from the redistribution of some shang tones, Changzhou preserves the rest of the 8 tones*. This is true for most Northern Wu dialects, including some Pudong varieties of Shanghainese, which has otherwise merged down to 5 tones that are mostly disregarded anyway.
In an oversimplification of the relationship between dialects, we can pretty much say that for two dialects of two different Sinitic languages (ignoring Mandarin and maybe Min as well for different but comparably significant reasons) which preserve the two registers of the four tones, what is a yang ru tone in one dialect will be a yang ru in the other. What’s more, an entering tone will end in p, t, k in Cantonese, p or k in Korean and -ʔ in Shanghainese or Changzhou dialect.
Except when it doesn’t. It’s an oversimplification because language contact is a thing, and words get borrowed from neighbouring dialects of dissimilar languages, thus /ŋ/ is quickly becoming /ʋu/ or /wu/ along the shores of the Yangtze.
Two cases have come up in my recent work that show this, but in a somewhat baffling way.
1) 昨 zuó is jok3 in Cantonese, 작 jak in Korean, tạc in Vietnamese, and ought to be zɔʔ8 in Changzhou, but instead it’s zo2, yang ping, corresponding to Mandarin’s zuó, also yang ping.
2) 幕 mù is mok6 in Cantonese, 막 mak in Korean, mạc in Vietnamese, and I’d expect it to be mɔʔ8 in Changzhou but actually it’s mɤʊ6, yang qu, which also corresponds to the tone of the syllable in Mandarin, yin and yang having merged into what is now Mandarin’s fourth tone.
I don’t have an answer, except to speculate that it is exactly what I mentioned before: These have been borrowed from across the River. The borrowing preserved the tone, and in the case of the qu sheng, assigned yin/yang based on voicing. I hope this is what happened, because it’s downright fascinating if that’s the case.
If anyone has some insight into this I’d love to hear it.
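For what it’s worth, the check I’m doing by hand can be sketched in a few lines: if the Cantonese reading ends in a stop, expect a glottal stop in Changzhou, and flag anything that doesn’t have one. The function names and the `ENTRIES` list are made up for illustration, the data is just the two cases above, and the only criterion is the final segment, so this is a rough filter rather than a real test for borrowing.

```python
import re


def cantonese_is_entering(jyutping):
    """True if the syllable, minus its tone digit, ends in p, t or k."""
    syllable = re.sub(r"\d+$", "", jyutping)
    return syllable.endswith(("p", "t", "k"))


def changzhou_is_entering(ipa):
    """True if the syllable, minus its tone digit, ends in a glottal stop."""
    syllable = re.sub(r"\d+$", "", ipa)
    return syllable.endswith("ʔ")


# (character, Cantonese reading, attested Changzhou reading)
ENTRIES = [
    ("昨", "jok3", "zo2"),
    ("幕", "mok6", "mɤʊ6"),
]


def candidate_borrowings(entries):
    """Characters that are entering tone in Cantonese but not in Changzhou."""
    return [
        char
        for char, canto, wu in entries
        if cantonese_is_entering(canto) and not changzhou_is_entering(wu)
    ]


print(candidate_borrowings(ENTRIES))
```

Had Changzhou shown the expected zɔʔ8 for 昨, it would not have been flagged.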
- – -
* more on shang distribution in a future post
Just a quick update to say that the phonetic corpus, as is, has a number of errors that need to be corrected, for reasons mentioned in the previous post. More than that, I’m trying to get a much more stable, accurate and comprehensive version put together so that it can be made available for public consumption.
I’m also in the process of building a parallel Suzhou phonetic corpus. This is all quite time-consuming, and it’s being done in addition to my other grad work. I’ll have it up as soon as I can, with progress reports in the meantime.
It’s enough to make you pull your hair out. You’re looking for the pronunciation of a single character which should not be a 破音词. It’s a simple one with a single meaning. It’s the character 多, this time.
You pull out your handy dictionary and check the index, which tells you the entry you want is on page 248… and 290. That’s ok though; lots of entries are duplicated since the dictionary is organised by category, not by stroke or pronunciation.
Flipping to page 248 you find /tu/. Sounds right. Checking page 290 to be sure, you find… /tɑ/. Hmm. The note says the latter is 代词, and that the reading has been held over from a much earlier pronunciation. You’ve just added a layer. Specifically, you can not count on being able to convert 多 to any transcription without knowing the context and usage. That means your parsing has to be that much more on-the-ball. Or maybe it’s worse than that, and your entire understanding of the situation is off.
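In data terms, that extra layer means a reading can no longer be keyed on the character alone. A minimal sketch, with a hypothetical `READINGS` table and `reading` function, and with "pronoun" standing in for the dictionary’s 代词 label:

```python
# Readings keyed by character and then by usage. Only the two readings of
# 多 mentioned above are included; the usage labels are made up for this
# sketch.
READINGS = {
    "多": {"default": "tu", "pronoun": "tɑ"},
}


def reading(char, usage="default"):
    """Look up a reading, falling back to the default for unknown usages."""
    by_usage = READINGS[char]
    return by_usage.get(usage, by_usage["default"])


print(reading("多"))             # ordinary usage
print(reading("多", "pronoun"))  # the held-over 代词 reading
```

The lookup itself is trivial. The hard part is the parser that has to decide which usage applies before it can ask.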
It’s easy when dealing with dialects to get frustrated. It’s especially easy if you have any expectation of things being systematic. To summarise a pretty clear expert on the topic, “every dialect in China is a creole”. It’s not that Spanish and Italian evolved from Latin but on different routes. It’s more like, that happened, but with lots of borrowing from French and Arabic, and from each other in not-so-predictable ways along the way. So 五 is /ŋ/ until /ʋʊ/ is borrowed from neighbouring Mandarin dialects and then /wu/ is borrowed a little bit later.
Language contact has always been rampant, and something like Hangzhou dialect, with its substantial influence from Song immigrants, is not so much an exception as it is a more obvious example of the rule.
It’s not enough to apply sound change rules to Mandarin and expect to get Wu, or even to get an interesting dialect of Mandarin (连云港话 anyone?). Since pretty much all digital setups are based on Mandarin, it pretty much means you have to start from scratch to make a system that’s natively comfortable with Wu, knowing when character X is pronounced Y and when Z, and it’s not going to agree with Mandarin.
It is frustrating. And it’s time-consuming. But it’s the reality. Lots of the work has been done. The only thing that hasn’t is getting it all online in a way that it can be combined and utilised in the best way possible.