本字、正字 and consistency in transcription

This keeps coming up with transcription work. The question is, when transcribing a person speaking their local dialect, what characters should you use? I provide the following definitions, which are up for debate:

本字 běnzì – The character that most accurately represents the word in etymology. In a way, it shows the cognates.
正字 zhèngzì – The “standard” character: the one that conveys the meaning of the intended word to a wider audience.

As a semi-hypothetical example, Dialect X has a word that means “high” or “tall”, read “huan”. It’s cognate with Mandarin 懸 xuán, as any educated speaker will tell you. A speaker of Dialect X may write it as 懸, or they may just write 高, even though they wouldn’t say 高 gāo or a cognate of 高. They may assume that the rest of the country, which doesn’t speak their dialect, might not know 懸 as having this meaning, since in Standard Mandarin 懸 means “to hang”. So they’re still writing in their dialect, but they’ve changed the characters to make it just a little easier to read for a wider audience.


Written Taiwanese and Cantonese

With the mercury in Taipei rising incessantly (roads have started melting and all), it seemed as good a time as any to expand the horizons of Sinoglot’s coverage to the Pénghú 澎湖 archipelago, a group of islands in the Strait of Taiwan. Fishing and tourism are the mainstays of the economy on these islands, which are also known in Taiwan for their boisterous religious festivals and the well-preserved local culture.

So, with a little trepidation at flying in a little turboprop plane for the first time, your correspondent bravely went where no Sinoglot post had gone before. It soon turned out that the preservation of the islands’ local culture extends to their language: unlike in Taipei, Taiwanese (Mǐnnányǔ 閩南語 / Táiyǔ 台語) is still going strong on these islands. You hardly hear any Mandarin on the streets, even among the younger generations. I asked a few islanders and they all agreed that almost all kids are still learning to speak Taiwanese and using it actively in everyday life, again unlike in Taipei.


Found Characters — take 2

When I was about to hit Post on the first “found characters” piece the other day, I could still hear that nagging voice: “Dude, you never write things clearly the first time. Just wait. Post it tomorrow.”

From the email, it appears I shoulda listened. I got some interesting photos, but only one example of what I had in mind. So now, belatedly, here’s my attempt at a definition:

Found Character: Something that can be recognized as a Chinese character but is accidentally such. That is, it is either made by nature or made by a person who had no intent of communicating with hanzi.

For what it’s worth, Found Character is a play on Found Art, not that I’m trying to compare hanzi to urinals…

Teeny tiny little "found characters"

You’ll excuse the artist for hooking his 小 the wrong direction, since he’s a bit of a birdbrain.

Still, I liked the style, and the medium, since that’s about the best thing one could do with Beijing’s eighth-inch of dust-dry snow.

Does anyone know if “found characters” have a formal name? I’m sure there’s some internet hound who’s collected ten thousand, but I don’t know how I’d search for them.

I shall be telling this with a Cai…

There’s nothing better than mile three of a glorious late fall trek through Beijing, when the winds have brought a respite from the usual bong backwash that passes for air, and the green grocers have graciously provided a living example of character simplification, especially one as logical as

That is: 大白菜 (dà báicài is what I would call napa cabbage) but with 菜 written as 才+艹.

Post posted post haste

Here’s a fragment I just bumped into in 《黄金时代》 (aka Wang in Love and Bondage — some details from Paper Republic):

校长长着长长的鹰钩鼻子,到处窥探……

Doesn’t sound like anything unusual if you read it aloud: “Xiàozhǎng zhǎngzhe chángcháng de…”, but the four 长s are kind of neat. And it makes you wonder what the record is for this kind of thing. I’d prefer to define it a bit conservatively, something like…

  • Take any 10 characters in a text
  • Disregard punctuation
  • Don’t allow contrived texts that are specifically designed to include lots of the same character. Just plain old writing.

What’s the greatest number of same characters (out of 10) out there? The one above is an easy 4/10, but I’m sure there’s better.
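For anyone who wants to go hunting, the rules above can be sketched in a few lines of Python. This is only one reading of them: the function name `max_repeats` is made up for illustration, “any 10 characters” is taken to mean 10 consecutive characters, and “punctuation” is taken to mean whitespace plus anything Unicode classifies as punctuation. The “no contrived texts” rule is a judgment call that no script can make for you.

```python
import unicodedata
from collections import Counter


def max_repeats(text, window=10):
    """Return (count, char) for the most-repeated character found in any
    run of `window` consecutive characters, ignoring punctuation.

    Hypothetical helper: treats whitespace and Unicode category P*
    as "punctuation" to be disregarded.
    """
    chars = [
        c for c in text
        if not c.isspace() and not unicodedata.category(c).startswith("P")
    ]
    counts = Counter()
    best = (0, None)
    for i, c in enumerate(chars):
        counts[c] += 1
        if i >= window:
            # Slide the window: drop the character that just fell out.
            counts[chars[i - window]] -= 1
        # Only the character just added can have set a new record.
        if counts[c] > best[0]:
            best = (counts[c], c)
    return best


print(max_repeats("校长长着长长的鹰钩鼻子,到处窥探……"))  # (4, '长')
```

Run it over a whole novel and it reports the best window it finds, which is at least a starting point for the record books.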

[If you think the title of this post is bad, you should have seen the one I trashed]

China Illustrata

放假了! (Fàngjià le! “We’re on holiday!”)

In a recent Language Log post, Hanzi Smatter circa 1700, Victor Mair discusses what appear to be fake Chinese characters on a European work of art.  In the comments, he adds a reference to a 1666 encyclopedic Latin work on China:

The great polymath, Athanasius Kircher (1601/1602-1680) had himself never been to China, but had a deep interest in Chinese characters, which are featured prominently in his China Illustrata (images readily available on the Web). Although his depictions of Chinese characters are painstaking, they are often so fantastically elaborated that it is impossible to determine which ones he was trying to represent.