Google Translate head games

Spot the difference:


No, I didn’t mess with anything. Yes, I did eventually realize that it was translating from “English”. The problem goes away when you switch from English to Chinese. Explanations?

Hyperpolyglot help

One of the things that I’ve become increasingly fascinated with (and have been fascinated with for as long as I can remember) is the question of how to learn a language to a basic level as quickly and efficiently as possible. By basic level I mean well enough that your next step would be to study intermediate learning material in that language: you’ve mastered the alphabet, the phonemes (but not all of the conversation-speed phonetic change “rules”), all but the infrequently used (or especially formal) syntactic patterns, and a vocabulary of at least 500-1000 of the most common words. You should be able to pick up a newspaper article in that language and say what it’s about (and not be too far off). You should be able to complete simple everyday tasks that require speaking and listening in that language.

When I was fresh out of grad school (mid ’90s), I remember discovering a company that specialized in quick language learning through reading.  They had a neat program that had translated mouse-overs for words, and ways to save vocabulary lists — something that we take for granted now, but was quite revolutionary back then.  The other day I was wondering if they were still around.


Another take on OCR

OK, I admit it. I watched Star Trek when I was younger. And universal translators were damn cool. Of course a part of me hated the idea, since learning the language is almost as much fun as actually using the language, at least sometimes.

So I’m a big fan of things like Pleco’s OCR. I only wish more people were providing such tools. It would be insanely useful for quick-skimming Korean, for example. Today, somewhat late to the game I admit, I came across WordLens. While not really related to China or Chinese in any way, it’s still pretty cool and worth sharing.


Tube foot

On using Google images for translation.

Tube foot? Maybe a fungus you acquire from spending too much time in front of the TV?

No, it’s one of these dangly things:


[photo credit]

Who’da known? Not me. But when your 9-yr-old daughter is working on her writing homework and wants to know the word for “those suction thingies that starfish have” your first move, naturally, is to go to Google images with that search term.

That’s what I did, and it wasn’t more than a few seconds before we got the right thing and the technical English term to boot. But then she reminded me:

“Daddy, so how do you say ‘tube foot’ in Chinese?”

Xiao’erjin is not quite Pinyin

Xiao’erjin (alternatively xiao’erjing¹ 小儿经) is the name of a form of transcription for Mandarin and related languages. Rather than using Cyrillic or Roman letters, it uses the Arabic script. China has had a large Muslim population for about as long as there have been Muslims, and the system was used among those Muslims who were less likely to have had a traditional classical education.

The structure is fairly simple. Syllable-initial consonants are written with a single Arabic letter. The final was then written primarily with harakat, or vowel diacritics. Before Annals of Wu, I was blogging on xiao’erjin and Chinese Islam in general on another site, appropriately enough called xiao er jing.


So you want to count unique Chinese characters in a document…

On why you should check out Chad Redman’s corpus tools at zhtoolkit.

Does this chart qualify as corpus porn?

[chart: Excel screenshot]

One of the benefits of rampant piracy is having access to digital texts for pretty much any popular novel in Chinese. Finding a text is a wee bit harder than a Google search (it helps to go to Baidu, and you have to dodge all the “we just want your email” scams), but so far I’ve been able to get the books I want without skanky computer viruses or an inbox full of ENlarGement ads. Most recently, with a little creative cutting and pasting, I’ve managed to get nearly a full copy (just missing the last few chapters) of 《兄弟》, Brothers, a novel I’m reading right now.
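If you just want a quick number before reaching for a proper corpus tool, here is a minimal Python sketch of the counting idea. It is not how zhtoolkit does it; it simply assumes that “Chinese character” means anything in the basic CJK Unified Ideographs block or Extension A, and the function names and sample string are made up for illustration.

```python
from collections import Counter


def is_cjk_ideograph(ch: str) -> bool:
    """Assumption: a 'Chinese character' is any code point in the basic
    CJK Unified Ideographs block (U+4E00-U+9FFF) or Extension A
    (U+3400-U+4DBF). Rarer extension blocks are ignored in this sketch."""
    cp = ord(ch)
    return 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF


def count_chinese_characters(text: str) -> Counter:
    """Count each Chinese character, skipping punctuation, Latin letters,
    digits, and whitespace."""
    return Counter(ch for ch in text if is_cjk_ideograph(ch))


# Hypothetical sample text; in practice you would read the novel from a file.
sample = "兄弟，兄弟！Brothers 是一本小说。"
counts = count_chinese_characters(sample)

print(len(counts))            # number of unique characters
print(sum(counts.values()))   # total number of characters
print(counts.most_common(3))  # the most frequent characters
```

For a whole novel, reading the file as UTF-8 text and passing it to `count_chinese_characters` should give the same kind of tally, which you could then paste into a spreadsheet for charting.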

Ngram this! — The 中文 Ngram challenge

Original title: The most fun you can have (legally) on a Saturday night in Beijing outside the fifth ring

If you haven’t already seen what Google has come up with…

Google Labs - Books Ngram Viewer

…then you’re probably in danger of becoming an offline recluse who lives in Beijing exurbia and whose idea of “social interaction” is giving a nod to the elderly gentleman who walks by every morning as you exercise at 5:30am.

But if you’ve got that problem, then why not submit your favorite Ngram sets in the comments and win the Ngram challenge! (Award amount to be announced as soon as sponsor is finalized)

Technical term translation

For your summer translation fun, here’s a summary of comments from a recent post on how to translate technical terms from English to Mandarin.

To make it simpler, let’s assume “ABCD” is the English* term you want to translate. The suggestions are:

  1. Baidu search for ABCD
  2. Iciba search for ABCD
  3. Web search for ABCD along with a Chinese term that is used within the same general area of knowledge as ABCD (example)
  4. Web search for ABCD plus 翻译
  5. Web search for ABCD plus 英文怎么说
  6. Use a hack of the Chinese Wikipedia by going to (example)
  7. Use the Modern Chinese Scientific Terminologies site (thanks, Andre, in comments below)
  8. Google’s translation tool (thanks, Transliterationisms in comments)
  9. Google’s dictionary with bilingual features. Instructions: “Select a language pair at the top, then type in the word in either language, it auto-detects and responds appropriately.” (thanks again to Transliterationisms)
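For anyone who ends up running the search-based suggestions over and over, the query patterns in items 3 to 5 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, no particular search engine or URL is assumed, and the sketch only builds the query strings and percent-encodes them for pasting into whatever search box you prefer.

```python
from typing import List
from urllib.parse import quote_plus


def build_queries(term: str, domain_word: str = "") -> List[str]:
    """Build the search strings suggested in the list for an English term.

    `domain_word` is an optional Chinese term from the same general area
    of knowledge (suggestion 3); it is left out when empty.
    """
    queries = [
        term,                  # plain search for the term
        f"{term} 翻译",         # term plus "translation"
        f"{term} 英文怎么说",    # term plus the phrase from suggestion 5
    ]
    if domain_word:
        queries.append(f"{term} {domain_word}")
    return queries


def encode_query(query: str) -> str:
    """Percent-encode a query so it can be used in a URL query string."""
    return quote_plus(query)


# "ABCD" stands in for the term you want to translate, as in the list above.
for q in build_queries("ABCD"):
    print(q, "->", encode_query(q))
```
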

Thanks for all the useful tips. Did I miss anything?

Also, for the record, this post is now archived in Sinoglot’s brand new Language Tools, Tips, Resources page.


*Anyone try these with a language other than English?

Cross-referencing "crosstabs"

How do you find precise translations for technical terminology? One of the convenient things about dealing with a mainstream language like Mandarin is that pretty much every English technical or trade term in every subject you could imagine has a Mandarin equivalent.

The devil is in finding it. Bilingual dictionaries are good for mainstream stuff, but they don’t tend to include the lingo that’s an inevitable part of any profession.

Voicedic — syllable pronunciation in 7 versions of Chinese

As you get so amped up for Children’s Day that you can’t sleep, I recommend you while away the wee hours at 语音字典 / Voicedic and then let me know what you think. I just came across it via PKUCN (the language discussion board that we mention on Sinoglot occasionally) and am intrigued.

Caveat: I’m not entirely sure how you’d use it. It offers per-Chinese-character audio in seven “dialects”: