You might remember the discussion we had last year about the peculiar usage of the exclamation “!” and other punctuation marks in modern Mandarin. I bring this up again because in yesterday’s news there was a remarkable piece of writing that illustrates the phenomenon. It is interesting, too, because the author is an admired member of the internet elite, a speaker of English who is used to working with foreigners: none other than Jack Ma, the founder of the Alibaba empire.

You can read all about it in this Forbes blog post. To make a long story short: Mr. Ma was slightly annoyed when he found that dozens of his employees were using the company to collude with outside swindlers, and he wrote a circular letter containing, in its Chinese original:

– 11 periods
– 21 exclamation marks.

In the first half of the letter it is even more pronounced, with a total of 12 exclamations for only 4 periods, and even those 4 look like they were forgotten there at the ends of the paragraphs.
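For anyone who wants to repeat this kind of tally on another text, here is a minimal sketch in Python. The function name `punctuation_tally` and the sample sentence are my own inventions for illustration; the sample is not taken from Mr. Ma's letter. It counts both full-width (Chinese) and ASCII forms of each mark, which means ASCII periods inside numbers or ellipses would be counted as well.

```python
from collections import Counter

# Full-width (Chinese) and ASCII forms of each mark.
PERIODS = {"。", "."}
EXCLAMATIONS = {"！", "!"}

def punctuation_tally(text):
    """Return (periods, exclamation marks) found in text."""
    counts = Counter(text)
    periods = sum(counts[c] for c in PERIODS)
    exclamations = sum(counts[c] for c in EXCLAMATIONS)
    return periods, exclamations

# Made-up sample sentence, for illustration only.
sample = "太过分了！我们必须行动！这是底线。"
periods, exclamations = punctuation_tally(sample)
print(f"{periods} periods, {exclamations} exclamation marks")
```

Running the same function separately over the first and second halves of a text would give the kind of per-half comparison described above.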

Xiao’erjin is not quite Pinyin

Xiao’erjin (alternatively xiao’erjing¹ 小儿经) is the name of a form of transcription for Mandarin and related languages. Rather than using Cyrillic or Roman letters, the Arabic script is used. China has had a large Muslim population for about as long as there have been Muslims, and the system was used mainly among those who were less likely to have had a traditional classical education.

The structure is fairly simple. Syllable-initial consonants are written with a single Arabic letter. The final was then written primarily with harakat, or vowel diacritics. Before Annals of Wu, I was blogging on xiao’erjin and Chinese Islam in general on another site, appropriately enough called xiao er jing.


Beijing bi-literacy

I like exploring almost as much as I hate tourism. The recent trip to Sinoglot’s Xiamen office, thankfully, was 80% exploring and only 20% t**rism. Even better, my parents, who have been visiting over the new year, are both more explorers than tourists. So when I take them on a hike in the western hills of Beijing, trying to find “a different trail that I’m sure could get us up to that pagoda,” and we end up on a desolate road squeezed between the base of a hill, abandoned development projects, and some rather weedy graves — they’ll enthusiastically tell me they had a great time.

stonescript

Another Chinese vs English sign test

Remember the question Sima brought up about how much surface area was needed to communicate equivalent amounts in Chinese vs English?

Looking back through that article and the comments, I’d conclude the following:

  1. For unpearly prose at any rate, the surface area needed is probably about the same between the two languages.
  2. It still might be the case that non-prose signs (e.g. a sign with succinct phrases or just a word or two) could be shorter in Chinese than English.
  3. “Native readers” of Language A can read Language A from a greater distance than they can a non-native Language B (whether A = Chinese or A = English).

All this came to mind at the Xiamen Botanical Gardens (also mentioned here) when I saw the sign below*:



Korean Grammar by way of Characters

[The following is a guest post. Yi Chonsang lives in Seoul where he works as an anonymous grunt for a multinational.]

I spend most of my time at work. I would love to enroll in a language program but it’s just not an option with my schedule. Meanwhile a friend here in Seoul is studying Korean full time, and he’s gotten pretty good at it. Now he’s looking to learn hanja, the Korean name for Chinese characters. He recently showed me the book he’s using for this, Learning Hanja the Fun Way.

He said it has been very useful so far for learning hanja. It must be, because he’s going through it fairly quickly. I had a different idea. I thought it would be a good way for a Mandarin speaker to learn Korean grammar by going through the book in reverse. Since I studied Chinese before, that’s exactly what I’m now trying to do.

The book is divided into lessons, each composed of a list of vocabulary words followed by a reading comprehension paragraph in Korean and English. The words from the vocabulary show up in the reading as hanja (漢字 hànzì); the list of hanja is given with hangul (the native Korean writing system), a definition, and a little picture to show you what each one is supposed to represent. This is like what you find in books like Remembering the Kanji/Hanzi.


Ryakuji in Mandarin

In Japanese they’re called ryakuji りゃくじ. In Korean, yakja 약자. The corresponding characters are 略字, pronounced lüè zì in Mandarin. They are the unorthodox simplifications that are seen in handwritten texts from time to time. They are not in any official list of approved kanji/hanja/hanzi, and you won’t really learn them in school. But they are used.

Think 仃 for 停 but lacking the authority once (briefly) held by 仃. Or, think of all those times you wrote 旦 in place of 单 蛋 or 弹 in your notes in class, because you couldn’t be bothered with all those strokes at the time. I know I’m not the only one to do this.


Is Huihui literate?

Translation from this article (thanks Joel for the link)

The tone is a bit maudlin, but this article captures well the sense I often hear from adults that Pinyin is almost a secret code. They find it very hard to read (naturally, I’m not implying anything intrinsic to the script, just that if you’re not used to it, it’s rather slow going) and often seem captivated by the idea that kids in first grade can use it to write out comprehensible language, even, as in Huihui’s case, to express heartfelt thoughts.



Mother bursts into tears at daughter’s first “Pinyin Diary”

2010-10-29 10:46:37

卉卉的“拼音日记”。

(Picture) Huihui's Pinyin diary

An Answer to Character Encoding Problems

A long while back I wrote a short series of posts on a small range of topics centered around the creation of characters, both modern and old. At the end of one such post, I mentioned that I had a solution to the problem. However, I never got around to posting it, in part because I felt I couldn’t articulate the idea as completely as it seems to exist in my head. Then a recent comment by 慈逢流 got me thinking an answer was only fair. This post is my attempt to provide one.

The Problem: Limited Characters

There are a number of characters that have existed in traditional sources that simply cannot exist on computers today, at least not with any wide use. There are obscure characters like the rare family name ben 㡷, which is composed of 本 under 广. These are characters which are encoded in Unicode but are unavailable, at least on the device with which I am currently writing this post. That’s primarily a font issue, but it goes beyond that. A character exists, for example, composed of 林 four times in a square arrangement. Even if one were to create a font with this character, one would need to either have it replace another existing glyph, or assign it to a private use area and then do some fancy replacement string coding for it to be shown. Neither is really a solution. Font encoding as we currently know it is insufficient for the full range of Sinitic characters. Even if more glyphs were added to the Unicode standard, which is constantly being done, it would still be insufficient.
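To make the distinction between "encoded" and "parked in a private use area" concrete, here is a rough sketch in Python. The `describe` helper and the short `BLOCKS` table are my own illustration and cover only a handful of Unicode blocks. The standard library's `unicodedata.category` returns `"Co"` for private-use code points, so that part does not depend on my hand-written table.

```python
import unicodedata

# A deliberately incomplete table of blocks, listed here only for illustration.
BLOCKS = [
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xE000, 0xF8FF, "Private Use Area"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
]

def describe(ch):
    """Return (code point label, block name, is_private_use) for one character."""
    cp = ord(ch)
    block = next((name for lo, hi, name in BLOCKS if lo <= cp <= hi), "other")
    is_private = unicodedata.category(ch) == "Co"
    return f"U+{cp:04X}", block, is_private

# An ordinary encoded character, then a private-use code point that a custom
# font could (hypothetically) map to an unencoded character.
for ch in ["林", "\ue000"]:
    print(describe(ch))
```

A glyph assigned to a private-use code point has no agreed meaning outside the font that defines it, which is why that route only works between people who share the same font.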


On the Limitations of Characters and Dictionaries

I have a few friends who are in the very early stages of character acquisition. As a result a few questions have come up, such as “how many characters are there?” which inevitably leads to the question of whether or not someone could just go and make up their own character.

So to illustrate, I bring your attention to a character allegedly created by Du Dingyou 杜定友 in 1914. Leading up to the May Fourth Movement, it was a good time for characters, seeing the invention of 她 tā (she) by Liu Bannong 劉半農 a few years later, which was subsequently popularised by our old friend YR Chao 趙元任 (Zhào Yuánrèn).


Hanzi vs Pinyin, in case you hadn’t heard enough

Just procrastinated my way into a (sort of) recent Language Log post by Victor Mair. The subject is whether Chinese characters are “necessary” for writing Chinese. There are 62 comments at this writing and a frenzy of emotion. One of the key quotes from Mair:

My rule of thumb is always this: if homography were a problem in (more or less) phonetic scripts based on real, spoken languages, then homophony would be a problem in the speech of such languages.

I’m not trying to pick on words, but this looks to me more like a tautology than a rule of thumb. By definition, spoken language written in a phonemic script is not going to have homophony problems unless the spoken language has homophony problems.

So why the big debate over whether Mandarin "can" be written in Pinyin? It's helpful to parse the question a bit. The real issue is whether Mandarin as it is currently written could be written successfully using Pinyin. That's the only case of serious interest. The other two — (a) writing Classical Chinese in Pinyin, or (b) writing spoken Mandarin in Pinyin — should be universally acknowledged as (a) impossible, and (b) a cinch.