Mystery bigram sighting
Anyone know 黛二, dài èr?
黛 itself isn’t that common. Maybe you know it from 黛绿, which the ABC Dictionary has as
dàilǜ attr. dark green ◆ id. beauty in full dress
That’s the only two-syllable word containing 黛 that I find in any dictionary, although it does appear in some foreign names, e.g. 史黛西 (Shǐdàixī) for Stacy. But I find nothing with 二. So why has 黛 hooked up with 二 in Jun Da’s corpus of “general fiction”, to compose bigram #1427 in frequency, the mysterious “黛二”?
Yes, I know it doesn’t matter. Yes, I know the Jun Da corpus has all sorts of limitations, and you often find oddball names and bits of party slogans in the bigram analysis at ridiculously high frequencies (e.g. the corpus also has the bigram 方怡, who, judging by Google Images, is a glamorous person I probably should’ve heard of).
But usually I can at least figure out why the bigram exists. Not so for 黛二. Is it a name, maybe a name from a different language? (That’s my daughter’s guess; you can only imagine what it’s like to be a ten-year-old whose father cuts into homework time to talk about corpora.) A mistake?
It’s not as if #1427 is rarefied atmosphere for bigrams. 黛二 is preceded by 开车 (kāichē = drive a car) at #1426.
Any ideas, short of digging into Jun Da’s corpus itself?
Perhaps from references (literal and metaphoric) to the characters Jia Baoyu 賈寶玉 and Lin Daiyu 林黛玉 in Dream of the Red Chamber, who are often spoken of as a pair: 寶黛二人.
Perhaps someone lazily entering 戴尔 (Dell Inc.)?
What was the context you saw it in?
[Sorry for double post]
I say lazy because for my pinyin input, it saves your commonly entered pinyin->hanzi strings. So if you’re entering 黛二 a lot, and then simply type “daier” intending 戴尔, then 黛二 would still be the first choice.
Seems unlikely: it appears in the fiction corpus, not in the news corpus.
Ah, yes, pc is right :-). My pinyin-to-hanzi input results in correct 戴尔 (‘Dell’) but a “creative” rendering of [daier] (as in many blog posts) may well result in 黛二 (or maybe this is done to avoid copyright claims: does a company like Dell “own” and protect its rendering as 戴尔?).
Update: It’s a name from a short story, 黛二小姐. See here.
Nice work, Bruce. Shoulda figured it was a character name. That also explains the 500+ instances in Jun Da’s corpus. Chalk this up to yet another example of the limitations of simple corpus analysis.
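For anyone who wants to see the character-name effect concretely, here is a minimal sketch in Python. It is not how Jun Da built his lists (I don’t know his method); it just counts adjacent-character bigrams over a made-up toy corpus and also records how many texts each bigram appears in. The function names and the three sample sentences are my own inventions for illustration.

```python
from collections import Counter


def char_bigrams(text):
    """Yield adjacent pairs of CJK characters, skipping pairs that
    include punctuation, spaces, or other non-CJK characters."""
    for a, b in zip(text, text[1:]):
        if "\u4e00" <= a <= "\u9fff" and "\u4e00" <= b <= "\u9fff":
            yield a + b


def bigram_stats(texts):
    """Return (total counts, number of texts containing each bigram)."""
    total = Counter()
    doc_freq = Counter()
    for text in texts:
        counts = Counter(char_bigrams(text))
        total.update(counts)            # raw frequency across the corpus
        doc_freq.update(counts.keys())  # each text counts at most once
    return total, doc_freq


# Hypothetical toy corpus: one "story" that repeats a character name,
# plus two unrelated sentences.
toy_corpus = [
    "黛二小姐开车。黛二小姐回家。黛二小姐笑了。",
    "他开车去上班。",
    "我们开车回家。",
]

total, doc_freq = bigram_stats(toy_corpus)
for bigram in ("黛二", "开车"):
    print(bigram, total[bigram], doc_freq[bigram])
```

In this toy data 黛二 and 开车 have the same raw count, but 黛二 comes from a single text while 开车 turns up in all three. A raw frequency ranking can’t tell those two situations apart, which is presumably how one story’s heroine ends up next to “drive a car”.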
pc & Jeroen: pretty creative on the Dell front. I have that “lazy pinyin input” problem with some people’s names that I’ve misspelled once and then continue misspelling because my 输入法 keeps giving the misspelling as an option…
Re two-syllable words with 黛: there is also 粉黛.
Ho Sun Yan: thanks for that. Looking it up, I realized Pleco has reworked the character lookup feature on Android (and I wouldn’t call it an improvement).