Wubi Insanity :(

July 9, 2012 at 06:00 PM

so i was trying to go down this list:

http://lingua.mtsu.e...st.php?Which=MO

and i couldn't even get through 10 characters, because i got stuck on 我

http://www.archchinese.com/chinese_english_dictionary.html?find=%E6%88%91

my mac is telling me the code is "TRNT" although experimentally "Q" also works.

i can definitely see where the "R" is coming from and i don't have any major problems with accepting "T" in the first position ... but how did the second "T" get in the 4th position after the "N" ???

shouldn't the last key either be a hook or a right falling stroke ? why is the last letter a left falling stroke ?

July 9, 2012 at 11:18 PM

My guess is that it's from the 丿 that occurs before the final 丶. I'm not sure why that is the case, though, instead of counting the 丶 as the last stroke. In practice, 我 is one of the single-key characters, and I almost always just use Q to type it (unless I'm typing it in a two-character word like 我们, where I can use the multi-character sequence TRWU).

In fact, 8 of those 10 are single-key characters:

的 R, 一 G, 是 J, 不 I, 在 D, 人 W, 有 E, 我 Q.
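
If it helps to see the idea as code, here is a rough sketch of how the short codes and the two-character sequence fit together. The single-key table and TRNT come from this thread; the code WUN for 们 and the "first two keys of each character" rule for two-character words are assumptions based on the usual Wubi 86 convention, and the function names are made up for illustration.

```python
# Single-key codes as listed above.
SINGLE_KEY = {"的": "R", "一": "G", "是": "J", "不": "I",
              "在": "D", "人": "W", "有": "E", "我": "Q"}

# Full codes. TRNT is from this thread; WUN for 们 is an assumption
# taken from common Wubi 86 tables.
FULL_CODE = {"我": "TRNT", "们": "WUN"}


def shortest_code(char):
    """Prefer the single-key code; fall back to the full code if known."""
    return SINGLE_KEY.get(char, FULL_CODE.get(char))


def two_char_word_code(word):
    """Assumed rule: first two keys of each character's full code."""
    if len(word) != 2:
        raise ValueError("this sketch only handles two-character words")
    return "".join(FULL_CODE[ch][:2] for ch in word)


print(shortest_code("我"))         # Q
print(two_char_word_code("我们"))  # TRWU
```

This is only a toy lookup; a real IME works from a full code table rather than two small dictionaries.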

July 10, 2012 at 06:35 AM

yeah i know most of those are single key - that's why i decided to start with them, so i don't end up memorizing longer codes for where short ones are available.

but tell me - is it even important to know the logic behind a certain code ? or do you just memorize the movement of fingers ? when i type in English it's certainly the fingers that do all the thinking - is it the same in Wubi or is the logic of it somehow important ? and is it only important at first ?

what i'm really asking is - how concerned should i be with the fact that TRNT for 我 doesn't make sense to me ? i can figure out the logic for perhaps 80% of characters.

?

July 10, 2012 at 07:04 AM

The last dot seems to be systematically disregarded, or treated as the last-but-one stroke, in a number of characters with 戈 (see attached file: a mix of the Jun Da data plus an additional Wubi column, a "grep" for a final "t" in the Wubi column, and manual editing to remove unrelated characters).

戊 is especially interesting : DNYT.

I don't know whether all characters with 戈 have this peculiarity (can't think of an easy way to check this).
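
For anyone who wants to repeat the "grep for a final t" step, a minimal Python sketch follows. It assumes a hypothetical table with one "character, whitespace, code" pair per line; the real Wubi code files may be laid out differently. The sample rows only use codes quoted in this thread.

```python
# Hypothetical sample in an assumed "character<TAB>code" layout.
SAMPLE = """\
我\ttrnt
戊\tdnyt
戈\tagnt
手\trt
的\tr
"""


def codes_ending_with(lines, key="t"):
    """Return (character, code) pairs whose code ends with the given key."""
    hits = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        char, code = parts
        if code.lower().endswith(key):
            hits.append((char, code))
    return hits


for char, code in codes_ending_with(SAMPLE.splitlines()):
    print(char, code)
```

Note that 手 also comes out of the filter even though it has nothing to do with 戈, which is why the list still needs manual clean-up (or component data) afterwards.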

There must be some obscure Wubi rule behind this...

Interestingly enough, in my Wubi manual the Wubi86 part just lists the last T of TRNT as meaning 丿, but in Wubi98 the new code is TRNY, with Y meaning 丶.

This is a very interesting mystery, but I'm late for work now

halberd_ends_with_t.txt

July 10, 2012 at 09:52 AM

ok so "RT" produces 手 and "AGNT" produces 戈 , also "RTNT" produces 扬长避短 ! ! ! but "TRNT" produces 我

where do you get a "wubi manual" ? i want one ! but it has to be in English because i'm a noob. and where did you get that TXT file from ?

i only have these two pages:

http://www.yale.edu/chinesemac/wubi/xing.html

http://en.wikipedia.org/wiki/Wubi_method

and the first one lately will not display characters ...

basically - can you suggest material ? ( in English )

July 10, 2012 at 10:31 AM

There is indeed an "obscure" rule, which is one of three such.

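　C. For characters such as 我, 戋 and 成, the "last stroke" varies from person to person, so, following the "top to bottom" principle, the left-falling stroke 丿 is uniformly defined as their last stroke.

For example: 我: 丿 扌 乙 丿 (TRNT; take the first, second, third and last components, four keys only)

戋: 戋 一 一 丿 (GGGT; a root that is itself a character: first "register its residence", i.e. type the key the root sits on, then take the 1st, 2nd and last strokes)

成: 厂 乙 乙 丿 (DNNT; take the first, second, third and last components, four keys only)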

The three rules can be found on http://soft.zol.com.cn/22/224235.html.

Roughly, the paragraph I quote says that since people write characters ending in the 戈 element according to personal preference, Wubi uses the "top-to-bottom" rule to designate 丿 as the last stroke.
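
Here is a small sketch of how that rule plays out, in case code is easier to follow than prose. The key assignments (丿 T, 扌 R, 乙 N, 厂 D, 丶 Y) are read off the codes quoted in this thread; the five-component split of 我 is my assumption for illustration, and the function ignores the extra rules for characters with fewer than four components.

```python
# Keys for the components that appear in the examples in this thread.
KEY_OF = {"丿": "T", "扌": "R", "乙": "N", "厂": "D", "丶": "Y"}


def wubi_code(components):
    """Take the first, second, third and last component (four keys at most)."""
    if len(components) > 4:
        components = components[:3] + components[-1:]
    return "".join(KEY_OF[c] for c in components)


# Assumed split of 我 with 丿 counted last, as the quoted rule prescribes.
wo_slash_last = ["丿", "扌", "乙", "丶", "丿"]
# Same components with the dot written last (the PRC stroke order).
wo_dot_last = ["丿", "扌", "乙", "丿", "丶"]

print(wubi_code(wo_slash_last))           # TRNT
print(wubi_code(wo_dot_last))             # TRNY
print(wubi_code(["厂", "乙", "丶", "丿"]))  # DNYT, the code given for 戊
```

So the only thing the rule changes is which of the two final strokes is treated as "last"; the rest of the code is built the usual way.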

I think mastering Wubi well and efficiently requires three things: understanding its principles, memorization, and practice. Excellence in one of the three can compensate for deficiencies in one of the others, according to your learning preferences. I might be willing to co-lead a Wubi study group if there is sufficient interest.

July 10, 2012 at 04:53 PM

@altair: so there is a rule! Thanks for the link. Well, if it's not in the Yale tutorial, it's an "obscure" rule hehe. I skimmed the explanation in my little 电脑小百科精华读本：五笔字型学,用,查 but at first glance they don't mention it either...

How odd that the Chinese can write either 丿 or 丶 as the last stroke of 戈 but we foreigners are taught to always write the dot last.

@neurosport: sorry, no English resources beyond the Yale tutorial. Perhaps someone should contact the tutorial author about the additional rules in Altair's link.

What I call my "wubi manual" is just a little book with a few chapters about wubi theory (in Chinese) and then a list of characters with their wubi codes (ordered by pinyin). Next to each character, the only comment is a list of the shapes that correspond to each letter of the code.

As to the text file, as I mentioned, it was obtained partly programmatically and partly by hand, from the Jun Da list you linked in your first post and a Wubi code file. The Wubi code file is basically a lighter version of a Wubi manual: it has no pinyin and no explanation, it's just a list of characters and Wubi codes. Such files are used by Wubi IMEs (the programs that allow you to type in Wubi). See this post for a link.

Edit: nice beard BTW.

July 10, 2012 at 05:23 PM

It's a bit complicated because stroke order nowadays is dictated by polity and often depends on which country you're in. Traditional (calligraphic) stroke order sometimes allows for several orders for a given character.

Finishing with 丿 is the ROC (Taiwan) standard. 丶 is the PRC (Mainland) standard. I believe that the major reason for the divergence is that Taiwanese writing used to favour a top-to-bottom direction, while the PRC switched to left-to-right early.

I'm not sure which was the "original", traditional stroke order. Wikipedia claims that the last stroke was 丶 like in the PRC standard, but it is possible that both were accepted.

Since Wubi was designed for simplified characters, I find it strange that it doesn't follow the PRC stroke order in this case.

July 10, 2012 at 05:47 PM

Thank you for the explanation.

Well they corrected this issue in Wubi98, but it does not seem to be widely used.

July 10, 2012 at 06:12 PM

I think ROC may be an/the outlier here:

戈

我

戔

I'm not sure about Hong Kong, and I don't remember where to find them...

[Edit] At any rate, if the PRC prescribes the dot last, why wouldn't Wubi follow that? That seems strange.

July 10, 2012 at 10:36 PM

"Since Wubi was designed for simplified characters, I find it strange that it doesn't follow the PRC stroke order in this case."

It wouldn't surprise me if it was done in part to help avoid clashes.

July 12, 2012 at 09:29 PM

is there at least a list of two character etc wubi codes ? the "find input code" function on my mac only shows codes for a single character.

July 12, 2012 at 11:01 PM

Most Wubi typing trainers have these lists built in. Probably the best thing is just to practice with them. I don't know of any specifically for the Mac, and the one I used to like for Windows doesn't seem to be available any more. However, Kingsoft make an acceptable typing trainer with Wubi support: http://www.51dzt.com/

For the Mac, I'm not sure which IME you use, but I like WBIM, which does show two (and more) character codes.

July 13, 2012 at 07:11 AM

From that file (GB encoding), saved as UTF-8 and filtered with grep -E ' [^[:space:]][^[:space:]]', I got a list of Wubi codes for words and expressions of two or more hanzi.

I don't know whether all Wubi IMEs implement the exact same list of multiple characters words.
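
If you don't have grep handy, roughly the same filter can be written in Python. This is a sketch that assumes each line is "code, space, word" (that layout is inferred from the grep pattern, so check it against the actual file); the sample lines use codes mentioned in this thread.

```python
import re

# A space followed by at least two non-whitespace characters,
# i.e. an entry whose word part is two or more hanzi long.
MULTI_CHAR = re.compile(r" \S{2,}")


def multi_char_entries(lines):
    """Keep only the lines whose word part has two or more characters."""
    return [line.rstrip("\n") for line in lines if MULTI_CHAR.search(line)]


sample = ["q 我", "trnt 我", "trwu 我们", "rtnt 扬长避短"]
for entry in multi_char_entries(sample):
    print(entry)
```

To run it on a real file you would open it with encoding="utf-8" and pass the file object to multi_char_entries.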

srcgnudarwinorgWuBi_2andmore_u8.txt

July 13, 2012 at 08:22 AM

No, they don't. The bulk are similar, but each IME tends to have its own words that you won't get as multi-character shortcuts in other IMEs.


neurosport

imron

neurosport

edelweis

neurosport

Altair

edelweis

renzhe

edelweis

Glenn

imron

neurosport

imron

edelweis

imron
