Do hanzi work better for captcha?

I’ve gotten pretty used to captchas like this:


For the visual-processing challenged, like myself, the worst of them can make it really hard to prove you’re not a machine.

So I was pleasantly surprised to find the hanzi version on Sina’s blog sign-up to be a cinch:


Sure, you have to know the characters. But compared to the roman letter captchas I’ve come to dread, they’re crystal clear. Is Sina the exception, or are Chinese character captchas really easier to decipher?


[Update: Thanks to Confused Laowai’s link in the comments below, take a look at Tianya’s captcha. To echo Kellen: holy crap.]


10 responses to “Do hanzi work better for captcha?”

  1. Porfiriy says:

    To be honest, I find Hanzi captcha to be super annoying, not for recognizing Chinese characters without the aid of a pop-up dictionary, but for this reason: usually when I’m registering for a site the captcha is surrounded by fields that require Latin input, like email address, or password, etc. So I get annoyed that I have to switch back and forth. Really it’s just two keystrokes but I wish I could just type my way through a form without having to stop and think. But that’s just me having a pet peeve. :)

  2. GAC says:

    I don’t have much problem with any of your examples, except in the middle alphanumeric one, the character between 5 and 3 is just bizarre. Besides that, there are easier and harder captchas in Latin script (I don’t have much experience in Chinese), with some having no deformation of the actual characters, ranging up to ridiculous levels of deformation and repositioning.

    Anyway, Chinese captchas had better not be too much easier. Captchas rely on the fact that computer pattern recognition algorithms, at least what’s commonly available, still aren’t quite as good as a human, but that doesn’t mean they can’t do something. Deforming Chinese characters too much might make things weird. Maybe the best way to make a Chinese captcha would be to use a different calligraphy style. Maybe 草书, since many Chinese can write it but computers aren’t likely to recognize it.

  3. fremen says:

    The Latin characters are deformed, the Chinese ones are not, so I don’t think this is a fair comparison. If you get the same deformations in Hanzi I think it could be way harder than English ones.

  4. Randy Alexander says:

    Maybe 草书

    Many Chinese can write it, but very few can read it.

  5. Carl says:

    The Perl of human writing systems?

  6. Andy says:

    As a student somewhat specialized in image processing, I would assume computer recognition of Chinese characters to be near impossible, just because there are a heck of a lot of them. As GAC says, they would use pattern recognition, and in English you would have a twenty-something number of patterns (letters) to compare with. In Chinese you would have XXX patterns stored somewhere and try to find the best match. Chances of finding the right one decrease drastically. That’s probably why there is less call for distorting them.

  7. Confused Laowai says:

    If you look at TianYa, they have the strangest captcha system when you register. It gives you 4 characters that are totally distorted, just like in a normal English captcha; however, it gives you a choice of 9 characters. So you have to recognise the right ones and use the numbers as input. Go check it out.

  8. Kellen Parker says:

    Re ConfusedLaowai’s link, holy crap.

  9. Syz says:

    @Andy: I was going to make some kind of argument like that — logically it makes sense; more characters to recognize => less distortion. But then, look at the Tianya example :)

    @Confused Laowai: the Tianya example is beautiful. I put a screenshot up in the original post so those too lazy to read comments could enjoy it too!

  10. GAC says:

    @Confused LaoWai

    Interesting. I like the numbering thing, so you don’t have to remember the pinyin, and the captchas are probably sufficiently distorted to compare with the English ones here, and I have to say — looks harder, actually.
