Pinghua romanization

Traveling in Guangxi, digging a little bit into exotic* Binyang Dialect while taking in the scenery (Toto, I have a feeling we’re not in Beijing anymore). I’ve done some cursory searching online but failed to find a romanization for Binyanghua, so I thought it would be fun to try making one myself and put the recordings up for the listening pleasure of anyone else who loves a good voiceless alveolar lateral fricative aka ‘voiceless el’ [ɬ]. Who knew a Sinitic language would have consonant phonology in common with frickin’ [forgive the pun] Welsh!

The goal is utilitarian: I’d like to be able to hear a word and write it down with confidence that I’ve got the basic phonemes right, including the phonemic tones.

How does one do a romanization? Unfortunately I have no academic background here, but I believe the following would be classified as the Empirical Brute Force method. Can’t say how well it’s gonna work, but at least it’s a starting point. Got ideas about how to proceed with analysis, samples you’d like to hear, or references I could look into? I’d love feedback!

BYHR = my attempt at a BinYangHua Romanization.

WARNING: This post is just a starting point, and what follows in the numbered sections is more or less a chronological exploration. The BYHR in the first sections is full of inaccuracies and inconsistencies. As I work my way through subsequent sections, I’m revising my hypotheses about what sounds and tones are phonemic. If you want to be boring and skip all the hemming and hawing, you can go to the end of the post to read the running hypotheses. I will try to follow up with future posts, but my time is short and it’s better not to make promises when your previous post was, oh, about two years ago.

Sample 1: “I’m drinking water”

“I’m drinking water” 我正在喝水
OK, this sounds straightforward enough. Not really that far off Mandarin. I’ll try breaking it down syllable by syllable.

Mandarin recording BYHR Notes
wei22 For the record, at this point I’m just kind of winging it on tones, using tone numerals 1-5 where 1 is lowest pitch and 5 is highest.
 sei33 Is it possible there’s a glottal stop at the end here? BTW, pretty sure there’s no s/sh distinction as we have in standard Mandarin, so just using /s/



Sample 2: “Drink it down in one gulp”

“Drink it down in one gulp” 一口喝下去
Yikes. This is sounding a bit more exotic now. Is that something like /bl/ in the fourth syllable? It’s harder to divide the syllables this time, but here’s my attempt.

Mandarin recording BYHR Notes
yet33 It sounds to me like there’s a stop at the end of this syllable that seems to get sort of assimilated into the /h/ of the next syllable. Similar to the assimilation of the ‘t’ in /hat sei/ above.
hou33 /ou/ not quite the same sound as /ou/ in Mandarin, but I’ll ignore that for now.
hep33 Hmm. I’m writing /p/ at the end of this syllable even though it sounds voiced. Guessing it’s just an assimilation because of the /l/ in the next syllable.
leok11 Not a vowel sound I’m familiar with. From other conversations I gather there’s a /k/ stop at the end of the syllable, although it’s not very noticeable here. And the associated hanzi would be 落 instead of 下 as in Mandarin.




Sample 3: One thru Twenty

As long as we seem to have a number (yet = 1) in the sentence above, let’s try listening to a bunch of numbers.


Mandarin recording BYHR Notes
ni52 the /n/ sounds more like [ŋ] than [n], but if there’s no phonemic significance, I’ll just write it as /n/.
hlam24 There’s our first voiceless alveolar lateral fricative [ɬ]. To keep it simple, I’ll try just using ‘hl’ unless the notation ends up looking problematic.
nou21 Again sounds like [ŋ]. To my ear, the tone here sounds consciously descending, but low-descending rather than the high-descending of ni52 above.
十一  sep11yet33 Phonemes seem right. Tone of yet is definitely higher than sep, but not super 55 high as marked above.
十二 sep22ni31
十三  sep11hlam13
十四 sep11hlei33
十五  sep22nou21  /p/ seems to get assimilated. Nou is still definitely descending, which might require then that sep start a little higher.
十六  sep22lok11 Again not sure of tones. Lok is lower, but it sounds to me like it doesn’t descend in the way that nou does.
十七 sep11cet33
十八 sep11bat22
十九 sep11jiou22
二十  ni42sep11



Interesting. Before starting this numbers exercise I hadn’t thought about how useful numbers might be for understanding the tonal system. I’ll come back to this.
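One way to make the numbers earn their keep: tally which tone numerals I've written next to each syllable, so the inconsistencies jump out. Below is a rough Python sketch of that bookkeeping, nothing more. The list is copied from the teens in Sample 3, and the names `TEENS`, `SYLLABLE`, and `tone_variants` are just made up for the illustration. It assumes every syllable is a run of lowercase letters followed by two or three tone digits.

```python
import re
from collections import defaultdict

# BYHR transcriptions copied from Sample 3 (eleven through twenty).
TEENS = [
    "sep11yet33", "sep22ni31", "sep11hlam13", "sep11hlei33",
    "sep22nou21", "sep22lok11", "sep11cet33", "sep11bat22",
    "sep11jiou22", "ni42sep11",
]

# Assumption: one syllable = lowercase letters followed by 2-3 tone digits (1-5).
SYLLABLE = re.compile(r"([a-z]+)([1-5]{2,3})")

def tone_variants(words):
    """Map each syllable's letters to the set of tone numerals seen with it."""
    seen = defaultdict(set)
    for word in words:
        for letters, tone in SYLLABLE.findall(word):
            seen[letters].add(tone)
    return dict(seen)

variants = tone_variants(TEENS)
for letters in sorted(variants):
    print(letters, sorted(variants[letters]))
```

Run on the teens, this shows sep turning up as both 11 and 22, which is exactly the kind of wobble I need to decide is either phonemic or just my ear.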


Sample 4: “Make / wrap zongzi”

“Make / wrap zongzi” 包粽子 [supposedly aka “sticky rice dumplings” — good stuff]


Mandarin recording BYHR Notes
 beo13 Is that the same /eo/ sound as in leok above?
zon44 First encounter so far with /n/ at the end of the word. As noted above with ni, it sounds more like [ŋ] than [n], but if there’s no phonemic significance, ‘n’ should be good enough.



Sample 5: “Binyang’s ‘fire cracker dragons’ are really famous”

“Binyang’s ‘fire cracker dragons’ are really famous” 宾阳炮龙很有名 [And pretty cool too. Do an images search for 炮龙]


Mandarin recording BYHR Notes
宾阳 ben44yein11 First /ein/ we’ve had.
炮龙 peo44lon11
很有名 hen44you22mek11 Honestly I don’t hear that /k/ at the end. But my informant assures me it’s there. Maybe it’s interference from Mandarin, maybe it’s just a weird recording.



Sample 6: Days of the week


Mandarin recording BYHR Notes
星期一  hlen33gei11yet55 Actually, the first time I went through these I misheard hlen as ‘sen’ — such is the influence of one’s dominant phonemic system.
星期二 hlen33gei11ni42
星期三 hlen33gei11hlam24
星期四  hlen33gei11hlei55
星期五  hlen33gei11nou21
星期六  hlen33gei11lok21 Can’t really figure out the tone on 6. Sounds like 21 in this case, but previously sounded more like 11.
星期日 hlen33gei11net21 Gonna need more samples of 日 to feel confident about that net. Is the [ŋ] at the beginning doing something funny to the vowel, or is it not the same as /et/ in previous words? Sounds like Russian Nyet to me.




Sample 7: “Today is Sunday”

“Today is Sunday” 今天是星期日


Mandarin recording BYHR Notes
 今天  gam33net21 So apparently this is 今日 rather than 今天.
星期日 hlen33gei11net21




Sample 8: “My hometown is Binyang”

“My hometown is Binyang” 我老家在宾阳


Mandarin recording BYHR Notes
 我 nou44 or 55?
leo42 Tones are hard to pin down — falling, anyway.
za24 Sounds a bit like a /t/ stop at the end, but that’s just the influence from the next word, zai.
宾阳 ben33yein11 Yein sounds sort of creaky voice like a good solid 3rd tone in Mandarin, no? Not sure if 11 is the right description…



Sample 9: “I’m playing guitar”

“I’m playing guitar” 我正在弹吉他

Mandarin recording BYHR Notes
? Clearly this doesn’t sound the same as 我 above. I even asked about it, but my BYH speaker says it’s insignificant. Just one of the vagaries of speech production — just gonna let it slide.
dan212 Tell me that voice isn’t bottoming out! Even lower than the previous word, zai. Significance TBD.


Mostly I liked this sentence because of how “guitar”ish 吉他 sounds compared to Mandarin!


Sample 10: “Thank you, Teacher”

“Thank you, Teacher” 谢谢老师


Mandarin recording BYHR Notes
 谢谢 sie42sie42 This is the first documented /ie/. Maybe it should be just /i/?
老师 leo11sai24




Working hypotheses

For initials, it seems like we’ve got the following so far and I’m pretty sure there are more. In the Examples column I’m including the tone markings just so you can do a Find in the browser and get to the relevant sample.

initial examples
/b/ bat33, beo13, ben44(33)
/c/ cet33
/d/ dan212
/g/ gei11, gam33, get44
/h/ hou33, hep33, hu44, hen44
/hl/ hlam24, hlei44(55), hlen33
/ji/ jiou22 [different from /zou/?]
/l/ lok11(21), leok11, lon11, leo42(11)
/m/ mek11
/n/ nou21(44), ni52(42), net21
/p/ peo44
/s/ sep11(22), sei11, sie42, sai24
/t/ ta24
/y/ yet55, yein11, you22
/z/ zon44, zei22, za24, zai11(22), zen44


OK, now the same for finals

final examples
/a/ ta24
/ai/ zai11(22), sai24
/am/ hlam24(13), gam33
/an/ dan212
/at/ hat44, bat33(22)
/ei/ wei22, sei33(11), hlei44(55,33), zei22, gei11
/ein/ yein11
/ek/ mek11
/en/ zen55(44), ben44, hen44, hlen33
/eo/ beo13, peo44, leo11
/eok/ leok11
/ep/ hep33, sep11(22)
/et/ yet33(55), cet33, net21, get44
/i/ ni52(31,42)
/ie/ sie42
/ok/ lok11(21)
/on/ zon44, lon11
/ou/ hou33, nou21, jiou22, you22
/u/ hu44
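With both tables in hand, it's possible to check mechanically whether a transcription fits the inventory so far. Here is a minimal Python sketch, assuming the initials and finals are exactly the ones listed above and that a syllable is letters plus two or three tone digits. `INITIALS`, `FINALS`, and `split_syllable` are hypothetical names for this example, and /ji/ is treated as an initial only because that's how I've listed it.

```python
import re

# Working-hypothesis inventories, copied from the two tables above.
INITIALS = ["b", "c", "d", "g", "h", "hl", "ji", "l", "m", "n",
            "p", "s", "t", "y", "z"]
FINALS = {"a", "ai", "am", "an", "at", "ei", "ein", "ek", "en", "eo",
          "eok", "ep", "et", "i", "ie", "ok", "on", "ou", "u"}

# Assumption: letters followed by 2-3 tone digits in the range 1-5.
SYLLABLE = re.compile(r"^([a-z]+)([1-5]{2,3})$")

def split_syllable(syllable):
    """Split a BYHR syllable into (initial, final, tone) or raise ValueError."""
    match = SYLLABLE.match(syllable)
    if match is None:
        raise ValueError(f"not letters + tone digits: {syllable!r}")
    letters, tone = match.groups()
    # Try longer initials first so that "hl" wins over "h".
    for initial in sorted(INITIALS, key=len, reverse=True):
        final = letters[len(initial):]
        if letters.startswith(initial) and final in FINALS:
            return initial, final, tone
    raise ValueError(f"no initial + final in the tables fits {syllable!r}")

print(split_syllable("hlam24"))
print(split_syllable("hen44"))
```

One thing this turns up right away: wei22 gets rejected, because I never put /w/ in the initials table even though wei22 is sitting there as an example of /ei/.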


What about tones? There’s really not enough data yet. My hunches are like this

phonemic category Best examples in this category Notes
Flat high zen44, hat44, hu44, yet55, hlei44, get44 I suspect this will ultimately include all the 33 examples too. Note for example that the number 1, yet, shows up as 55 but also as 33.
Flat low zai22(11), leok11, lok11, sep11 Probably all the 22s belong here. I’m a little confused about whether there might be an even lower tone of some sort — see note about dan212 below.
Rising hlam24(13), beo13, na24, sai24
Falling high ni52(42), sie42
Falling low nou21, net21

Also, possible tone sandhi: two flat-high tones next to each other, the second one is slightly lower, e.g. hat44sei33
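To see how far those hunches stretch, here's a small Python sketch that sorts a tone numeral into one of the five categories. The cutoffs are my own guesses for the sake of illustration, not established facts: level tones at 3 or above count as flat high (following the suspicion that the 33s belong there), and falling tones that start at 3 or above count as falling high. Anything that isn't level, steadily rising, or steadily falling comes back as "unclassified". `tone_category` is a made-up name.

```python
def tone_category(tone):
    """Guess the hunch category for a tone numeral string such as '44' or '21'."""
    levels = [int(digit) for digit in tone]
    first, last = levels[0], levels[-1]
    if len(set(levels)) == 1:
        # Level tone. Assumed cutoff: 3 and up is "high".
        return "flat high" if first >= 3 else "flat low"
    if levels == sorted(levels) and last > first:
        return "rising"
    if levels == sorted(levels, reverse=True) and first > last:
        # Assumed cutoff: falls that start at 3 or higher are "falling high".
        return "falling high" if first >= 3 else "falling low"
    # Dipping or peaking contours such as 212 don't fit any hunch yet.
    return "unclassified"

for tone in ["44", "33", "11", "24", "52", "21", "212"]:
    print(tone, tone_category(tone))
```

Under these cutoffs dan212 lands in "unclassified", which matches my confusion about it.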

Stuff I’m confused about…

我 I’ve got it 3x in the samples above: wei22, nou44, and [?].
dan212 Not sure if this is a super-low tone or if it’s just another version of flat-low as I’ve got above.


*宾阳话 is a subset of the top level Sinitic fangyan group Pinghua, which is to say Pinghua is parallel to Mandarin, Yue (Cantonese), etc. To paraphrase Wikipedia’s Pinghua entry and Baidu Baike’s 宾阳话 entry, in the past Pinghua was classified as part of Yue, but it was split off in the 1980s. It qualifies as exotic cuz there aren’t many speakers, as Sinitic languages go: total around 2m for Pinghua and 800,000ish for Binyanghua. It counts among its speakers both Han and Zhuang, and there seems to be some serious ethnic mixing according to one genetic study I came across.

"China's tower of babel" and the language / dialect question. Again.

China Realtime Report put up a good piece about Phonemica. I thought the title not bad: “Getting China’s Tower of Babel on Record”.

A lot of you might see the article anyway, but I doubt you’d make it over to the comments, and the very first comment is probably one that Sinoglot readers and writers alike have spent way too much time thinking about:

What is the distinction between a language and a dialect?

Since I thought Kellen’s response gave a pretty nice simple summary, and since I know he’d be too shy to repost it himself, I thought I’d give it its very own Sinoglot post:

The real answer is that there is no answer. The distinction is arbitrary and can be motivated by a number of different factors; It can be political, historical, sociological, or just based on convenience. For example High German, Low German and Dutch form a continuum where a speaker from one end can’t understand a speaker from the other end if each is speaking their own hometown dialects, but speakers from any two neighbouring towns will have little trouble in communicating. China is made up of a number of such continuums, Mandarin being one, Cantonese another, Wu a third. For the project we treat Cantonese as a language and Mandarin as another language, but with a distant common ancestor, the same as Italian and Spanish are related through Latin. This is the reason we tend to group the entirety of our focus in the project under “Sinitic”, referring to any modern language variety that is descended from Old Chinese. The shared relationship of these language varieties is known, and the appropriateness of different degrees of fineness in distinctions between them is different for different situations.


本字、正字 and consistency in transcription

This keeps coming up with transcription work. The question is, when transcribing a person speaking their local dialect, what characters should you use? I provide the following definitions, which are up for debate:

本字 běnzì – The character that most accurately represents the word in etymology. In a way, it shows the cognates.
正字 zhèngzì – The “Standard” character. That which represents the meaning of the intended word for a wider audience.

As a semi-hypothetical example, Dialect X has a word that means “high” or “tall”, read “huan”. It’s cognate with Mandarin 懸 xuán as any educated speaker will tell you. A speaker of Dialect X may write it as 懸, or they may just write 高. They wouldn’t say 高 gāo or a cognate of 高. But then they may assume the rest of the country which doesn’t speak their dialect might not know 懸 as having this meaning, since in Standard Mandarin 懸 means “to hang”. So if you can imagine, they’re still writing in their dialect, but they’ve changed the characters to make it just a little easier to read for a wider audience.


Origins of Khor Ewe Pin

Can Sinoglot readers puzzle out the romanized version of a name of Chinese origins? Chris Waugh writes:

Got an email from a friend about a Malaysian Chinese author whose name is Khor Ewe Pin/许友彬, a bit of googling and threw up an article in Bahasa Melayu that gave as his bahasa ‘Kantonis’ and ‘Bahasa Inggeris’ and ‘Bahasa Melayu’. A bit more googling suggested Khor is a possible romanisation of 许 in Malaysia. I made a thoroughly uneducated guess based on that Bahasa: Kantonis that Khor may be Cantonese, but neither of us has managed to find any more…. I was wondering if somebody in the Phonemica or Sinoglot communities might be able to answer this question: what ‘lect is Khor Ewe Pin? Thoughts?

Chinese: 7 languages and 49 dialects?

UPDATE: Thanks to ahbin in the comments, I’ve added “Ping Speech” (平话) as a dialect of Cantonese. Its omission was an oversight for which I owe maybe 2 million people an apology. That brings us to 50 “sub-fangyan” (次方言) under the original 7 fangyan. This makes the title of my post obsolete, but what the heck. The sub-fangyan vs fangyan decision follows the basic scheme of the Chinese Language Atlas (中国语言地图集)


How should Chinese be categorized, linguistically?

Fundamentals of Chinese Dialect Studies (《汉语方言学基础教程》:李小凡,项梦冰) describes how in the first half of the 20th century, the proposed divisions of Chinese increased from four in early scholarship, up to eleven in one scheme. Now most scholars are back down to seven or eight. But between language change and debates about definition, it’s a question that guarantees academic employment for years to come.

7/49 is the plan I’ve just posted on the Phonemica blog. I’m pasting the chart below.

Written Taiwanese and Cantonese

With the mercury in Taipei rising incessantly (roads have started melting and all), it seemed as good a time as any to expand the horizons of Sinoglot’s coverage to the Pénghú 澎湖 archipelago, a group of islands in the Strait of Taiwan. Fishing and tourism are the mainstays of the economy on these islands, which are also known in Taiwan for their boisterous religious festivals and the well-preserved local culture.

So, with a little trepidation at flying in a little turboprop plane for the first time, your correspondent bravely went where no Sinoglot post had gone before. It soon turned out that the preservation of the islands’ local culture extends to its language: unlike in Taipei, Taiwanese (Mǐnnányǔ 閩南語 / Táiyǔ 台語) is still going strong on these islands – you hardly hear any Mandarin on the streets, even among the younger generations. I asked a few islanders and they all agreed that almost all kids are still learning how to speak Taiwanese and using it actively in everyday life, again unlike in Taipei.


Jin Chinese?

See below for your chance at glory in the big smackdown: Jin Chinese vs Jìnshǎn Mandarin.

Language, or dialect?

Jin Chinese is right at the heart of the never-ending fangyan / language / dialect debate (e.g. here or here or here on Sinoglot). Jin is categorized by some as a top level fangyan (方言) in its own right, comparable to Mandarin, Cantonese, etc., while others insist Jin belongs under the broad wing of Mandarin where it should be classified as a sub-fangyan (次方言), parallel to, say, the Northeast dialect or Beijing dialect.

The location of Jin speakers makes the whole debate even more intriguing: they’re right next to Beijing, practically!


The Elderly

note: Sinoglot readers rock. Seriously. You guys have consistently provided good discussion, which is what we talked about wanting, what seems like ages ago, when we decided to put this site together. We’ve all been a bit busy these days so the posting has slowed down. To remedy that, I have a few quick posts I’m going to throw up here in hopes of getting some more discussion going. This is the first. Thanks for kicking ass.

I’ve written elsewhere about trying to talk to the elderly in China. On a trip to Henan province last year I was somewhat surprised by the fact that I could actually understand people and communicate with putonghua. I thought that this was a strictly southern phenomenon, being unable to talk to anyone over 50, but today it seems to have crept further north than I’d otherwise thought.

Today I was talking to a friend of mine from northern Jiangsu province about dialects and communication. She was saying that her parents, not yet 50 years of age, cannot speak standard Mandarin. I figured it was not a big problem since it was still beifang-hua, so to test I had her run through the usual phrases I make everyone say. Not terribly surprisingly, it didn’t sound much like Mandarin. It was clearly a northern dialect but one that I’d have a hard time understanding in the context of a real conversation. Not yet 50.