jco Posted May 20, 2018 at 01:04 PM

There are a number of older threads on this, so I am asking to see if there are any updates (and I have some more specific questions). Regardless, I want to be able to type deterministically, and I also like the idea of typing reinforcing my writing, so 五笔 (Wubi) it is. I'm on a Mac.

- Is there a manual with all the rules on how the system works? In Chinese is fine.
- What are the best apps or websites to practice?
- What's the best IME for Mac?
- Is it possible to get the keystroke data? I'm a programmer and have some ideas for apps to help with learning.
- For series of characters, I know there are faster ways to write them. Is it possible to work this out with just the character data above? Is that algorithm described somewhere?

I'm imagining taking all the characters I know how to write (based on Anki progress) and then prompting myself with series of them. Ideally the backend would know the most efficient way to type them and could detect if the user does it in a more roundabout way.

I welcome any other resources, apps, advice, etc.

jco (Author) Posted May 20, 2018 at 05:59 PM

The Yale site is dead, but I did finally find a PDF of the explanation that lived there: http://www.indonesia-kita.com/bahan/Wubizixing for Speakers of English.pdf

Edit: and it looks like it's hosted online here: http://chinesemac.org/wubi/xing.html

I feel like the hardest and most important ask above is the database of characters that lives behind the IME, especially the special forms (2-stroke words, 2 characters with abbreviated forms, etc). Ideally the data would be in the form of the sub-pieces, and then the keystrokes would be derived from that. I know this data has to exist, but I have no idea where, and a bit of googling got me nowhere.

At worst, one could try to brute-force an IME. 25^4 = 390625, so it wouldn't be too hard to see what the IME outputs for every possible 4-keystroke combination. That's extremely lame though!

Edit2: Actually I guess it'd be 25^4 + 25^3 + 25^2 + 25 = 406900, since codes can be 1 to 4 keys long. Still very tractable.

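For anyone who wants to check the brute-force arithmetic above, here is a minimal sketch that enumerates every candidate key sequence of length 1 to 4. It assumes the 25 code keys are the letters a through y (z left out), which is an assumption on my part rather than something stated in the post; `all_codes` is just an illustrative name. It only generates the sequences and does not talk to any IME.

```python
from itertools import product
from string import ascii_lowercase

# ASSUMPTION: the 25 code keys are a..y, with z excluded.
KEYS = ascii_lowercase[:25]


def all_codes(max_len=4):
    """Yield every key sequence of length 1..max_len, shortest first."""
    for length in range(1, max_len + 1):
        for combo in product(KEYS, repeat=length):
            yield "".join(combo)


# Count the sequences by actually enumerating them.
total = sum(1 for _ in all_codes())
print(total)  # 406900
```

The count matches 25 + 25^2 + 25^3 + 25^4, so feeding each sequence to an IME would need roughly four hundred thousand lookups.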
imron Posted May 20, 2018 at 07:25 PM

6 hours ago, jco said:
> what's the best ime for Mac?

Don't use the built-in ones that come with the OS; they are all rubbish. I currently use 清歌输入法 (the Qingg input method) and quite like it. My previous favourite was WBIM, which seems to no longer exist (but I think 清歌 is better anyway).

If you install 清歌, it also ships all the different lookup tables in txt format, which you can process however you like. It puts them in the following folder:

/Library/Input Methods/Qingg.app/Contents/Resources/

as wb_table.txt, wb_reverse_table.txt and so on.

I can't help with apps for learning, as all the ones I used to recommend seem to no longer exist.

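As a starting point for processing those tables, here is a rough sketch in Python. I have not verified the actual layout of wb_table.txt, so the line format is an assumption: each data line is taken to be a code followed by one or more whitespace-separated entries, with `#` lines treated as comments. The function names `parse_table` and `reverse_index` and the sample lines are made up for illustration; adjust the parsing once you have looked at the real file.

```python
from collections import defaultdict


def parse_table(lines):
    """Map each code to its entries, keeping the order found in the input.

    ASSUMPTION: a data line looks like "<code> <entry> [<entry> ...]".
    Blank lines and lines starting with '#' are skipped.
    """
    table = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        code, *entries = line.split()
        table[code].extend(entries)
    return dict(table)


def reverse_index(table):
    """Map each entry to every code that produces it."""
    reverse = defaultdict(list)
    for code, entries in table.items():
        for entry in entries:
            reverse[entry].append(code)
    return dict(reverse)


# Made-up sample lines in the assumed format (not copied from the real file).
sample = ["# sample only", "qtn 儿 勹", "fcu 去 云", "igsk 清歌"]
table = parse_table(sample)
print(table["fcu"])
print(reverse_index(table)["清歌"])
```

To try it on the real data, pass an open file object for wb_table.txt from the folder above to `parse_table`; the file's text encoding is another thing to check rather than assume.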
歐博思 Posted May 20, 2018 at 09:49 PM

FWIW, some pinyin input methods, such as my current Microsoft Pinyin, have a 'u' input mode. For example, you can type 品 (pin) by typing 'u, kou, kou, kou'. I was able to get 彯 with piao and san, despite never having typed it in 'u mode' before and not knowing how the system would classify the right-side radical.

Shelley Posted May 20, 2018 at 10:03 PM

I don't know if I am stating the obvious here, but I did a search for Wang Yongmin (the inventor), which led me to this: https://en.wikipedia.org/wiki/Wubi_method

There were also lots of other results from that search. This was interesting too: http://www.wangma.net.cn/

jco (Author) Posted May 21, 2018 at 12:34 PM

17 hours ago, imron said:
> Don't use the built-in ones that come with the OS, they are all rubbish. I currently use 清歌输入法, and quite like it. My previous favourite was WBIM which seem to no longer exist (but I think 清歌 is better anyway). If you install 清歌, then they also have all the different lookup tables in txt format that you can process however you like. It sticks them in the following folder: /Library/Input Methods/Qingg.app/Contents/Resources/ as wb_table.txt, wb_reverse_table.txt and so on. I can't help with apps for learning, as all the ones I used to recommend seem to no longer exist.

This is beautiful. This is exactly the response I was hoping for. I have a bunch of personal training methods I plan to program for myself. If it works well, perhaps I'll open source it so resources don't keep disappearing...!

jco (Author) Posted May 21, 2018 at 05:25 PM

Here's a question... is it possible to actually type all of the characters in the keys? Are they all in Unicode? It seems to me that they are not, but I'm not sure. For example, I'm trying to see if I can type the first character on the left of the second row of Q in this chart: http://chinesemac.org/wubi/images/wbxbigkeys.gif

It's like the left of 狼 but without the final stroke. To me, in Wubi logic it seems like it would be a 丿 (t) and then a 丨 (h), so it'd be qthl. But that is not the case. In fact I haven't been able to produce it.

Furthermore, I'm trying to understand when you need to pad with the l and when you don't. In the case of the character 勹, it seems like it is qtn2. This raises a couple of questions! First, I don't see why it isn't qtnl. Second, in trying to understand the logic, I didn't realize that numbers could be a part of it... I thought there was a purely stroke-based way to get to every character, but that doesn't seem to be the case. For example, qtn gives: 儿, 勹, ⺈. (Weird: if I type qtn3 it comes out looking like this [glyph not rendered], but I was able to paste it in from the wb_table.txt file... any ideas why my browser is choking on that?)

So I'm trying to understand the logic of how these sorts of things are laid out... I read the Yale guy's Wubi tutorial, but he didn't mention numbers as being necessary. Are there other cases where it's necessary to select? The whole appeal of Wubi (for me at least) is that it is deterministic (I hate the type-and-select flow of pinyin), so I'm trying to understand how it works. My ideal is to be able to have a list of each sub-component in Unicode, then the key it is on.

I guess also relevant: do all of the Wubi implementations have the same constituent characters assigned to the same characters? I wasn't able to find a breakdown for 清歌. For example, is the image I linked above actually the right one?

Thank you!

imron Posted May 22, 2018 at 01:49 PM

On 5/22/2018 at 1:25 AM, jco said:
> is it possible to actually type all of the characters in the keys?

No. Incidentally, those 'characters in the keys' are usually referred to as 'character roots' (字根).

On 5/22/2018 at 1:25 AM, jco said:
> Furthermore, in trying to underscore the logic, I didn't realize that numbers could be a part of it

They aren't necessarily, but different IMEs often add their own enhancements to make things easier. That being said, there are a small number of characters that do have conflicts (去 and 云, both fcu, are another example). They are still deterministic, though, in that the ambiguous characters will always be presented in the same order (rather than swapping about), so when they do come up you can just learn it as adding a 2 or a 3 to the end to get the one you want.

On 5/22/2018 at 1:25 AM, jco said:
> do all of the wubi implementations have the same constituent characters assigned to the same characters?

If by constituent characters you mean character roots, then yes. Unless you are talking about 五笔98, which uses a different set of mappings, or 五笔18030, which added roots for typing traditional characters but still kept all the standard 五笔86 mappings too. The image you showed has the correct key mappings for 五笔86, which is what most people use.

The key code for 清歌 is igsk, which takes the first two roots of 清 (氵王) and the first two roots of 歌 (丁口).

You'll often find yourself thinking "it's a bit of a stretch to use such and such a root for such and such a character", and you'd be right; sometimes it is a bit of a stretch. The system is not perfect, and sometimes you have to bend yourself to its logic. One that always used to throw me off was in things like 追, which you might logically think of as yhnp (丶 丨 hook 辶) but is actually wnnp (亻 hook hook 辶). In other words, the 丶丨 is replaced by a single 亻, which I have always found weird.

See also 段, which is wdmc for 亻 三 几 又 (亻 for this is also a bit of a stretch in my mind, but it is what it is and can't be changed, so you just have to let your mind stretch). My guess is that the creator of 五笔 did things like this to help reduce the number of key codes that had character conflicts or ambiguity in them.

(Note: in the above examples I haven't typed the exact roots, due to the above-mentioned problem of not being able to type all the roots, but you should be able to get the basic idea.)

That being said, different implementations usually have their own slightly different vocabularies, so you'll be able to type 清歌 using 4 keys (rather than 8) with 清歌输入法, but probably not with other ones. Likewise, some IMEs will have 你好 as a single 4-key code (wq vb) and others won't. The IMEs will mostly be similar, though, for the most common vocab.

On 5/22/2018 at 1:25 AM, jco said:
> My ideal is to be able to have a list of each sub-component in unicode, then the key it is on.

Not gonna happen, unfortunately, because Unicode doesn't have support for each root. You can see which ones have Unicode support here. Some IMEs also rely on the Private Use Area of Unicode to provide other roots (as you discovered above with [glyph not rendered]), but you need to make sure you have a font that supports it.

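Two points from the post above are easy to sketch in code: the fixed-order selection digit for conflicting codes, and the two-character word code built from the first two keys of each character. This is a minimal illustration, not how any particular IME is implemented. The `CANDIDATES` dictionary, the function names, and the candidate order for fcu are assumptions (the qtn order is the one reported earlier in the thread). The full single-character codes passed in the example calls are typed from memory and should be checked against the lookup table; only their first two keys, which match the codes given in the post, affect the result.

```python
# Hypothetical sample: code -> candidates in the fixed order the IME lists them.
CANDIDATES = {
    "qtn": ["儿", "勹", "⺈"],  # order as reported earlier in the thread
    "fcu": ["去", "云"],        # ASSUMPTION: order chosen only for illustration
}


def keystrokes(char, candidates=CANDIDATES):
    """Return the code, plus a selection digit if char is not listed first."""
    for code, chars in candidates.items():
        if char in chars:
            position = chars.index(char) + 1
            return code if position == 1 else f"{code}{position}"
    raise KeyError(char)


def two_char_phrase_code(first_code, second_code):
    """Build a two-character word code from the first two keys of each code."""
    return first_code[:2] + second_code[:2]


print(keystrokes("勹"))                       # qtn2
print(two_char_phrase_code("igeg", "sksw"))   # igsk (清歌)
print(two_char_phrase_code("wqiy", "vbg"))    # wqvb (你好)
```

Because the candidate order is fixed, the digit can be memorised as part of the code, which keeps typing deterministic even where codes collide. Whether a given word code exists at all still depends on the IME's own vocabulary, as noted above.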
大块头 Posted September 7, 2020 at 10:56 PM

I've spent a good chunk of the day trying to wrap my head around the 五笔字型 86 system. My problem is that every table of 字根 seems to list a slightly different set of squiggles for each key. Can anyone recommend a source that gives full 汉字 examples for each of the 字根? I found this, but it isn't comprehensive enough.

Some good references I found:

- 五笔查 app for Android
- online IME

imron Posted September 8, 2020 at 05:22 AM

The ones on Baidu Baike are likely good enough. There's not much point in having a 'complete list' for each key. Wubi is learnt through typing, not through memorizing per-key shape lists. Use the lists to get a general feel for how things are laid out, and then just concern yourself with typing. For that reason I'd also avoid an Android app and try to find something for a desktop or laptop with a proper keyboard.

大块头 Posted September 8, 2020 at 01:44 PM

OK, thanks for the advice. I'd still recommend this app though. You enter characters and it shows you how they're decomposed into the 字根.