CapnKernel
Posted December 31, 2007 at 07:46 AM

Hello,

Does anyone have any advice for me on how I can do pen input of hanzi under Linux? (Apart from "Switch to another OS")

Regarding software, I found the following programs for doing hanzi recognition:

http://kiang.org/jordan/software/hanzilookup/
http://tomoe.sourceforge.jp/
http://hanzirecognizer.sourceforge.net/

Has anyone used these for everyday input? Are there alternatives which integrate better into X (like PenPower etc. for Windows)?

Regarding hardware, are there any low-end tablets anyone can recommend that will work for this job under Linux?

Hoping to hear from you,

Mitch.

character
Posted December 31, 2007 at 03:30 PM

> Does anyone have any advice for me on how I can do pen input of hanzi under Linux? [...] Regarding hardware, are there any low-end tablets anyone can recommend that will work for this job under Linux?

While it's always worth googling to double-check a specific model, tablets with the Wacom digitizer should work fine. In Ubuntu 7.10, just edit the xorg.conf file to enable the Wacom support and restart X. FWIW, I'm using a ThinkPad X60 Tablet.

    Section "ServerLayout"
        Identifier     "Default Layout"
        Screen         "Default Screen"
        InputDevice    "Generic Keyboard"
        InputDevice    "Configured Mouse"

        # Uncomment if you have a wacom tablet
        InputDevice    "stylus"    "SendCoreEvents"
        InputDevice    "cursor"    "SendCoreEvents"
        InputDevice    "eraser"    "SendCoreEvents"

        InputDevice    "Synaptics Touchpad"
    EndSection

I can't help you on software except to say Wenlin (including its character recognition) runs fine under Wine in Ubuntu 7.10.

> Has anyone used these for everyday input?

I would be surprised if anyone does. If you're going to be writing a lot of Chinese, pinyin or some other keyboard-based method will be much faster than any handwriting recognition.

CapnKernel (Author)
Posted January 6, 2008 at 01:14 AM

Thanks for that, character.

Actually, it's for my (Chinese) wife, who doesn't know pinyin or wubi or cangjie or any other entry method. But she sure knows how to write! I'm hoping that your instructions with a Wacom tablet, and what I found below, might help.

I found two other options (which I'll post here in case anyone else is interested in the future).

The first is a program that IBM released several years ago called "CCR". I can find mention of it, but wasn't able to download the tar file:

http://google.com/search?q=ibmccr
http://www.linuxfans.org/nuke/modules.php?name=Site_Downloads&op=geninfo&did=562

IBM says they have withdrawn it.

The second is a handwriting module which is part of the "Unihan" suite, which I can find in two places. The first is as part of "RAYS Linux", a Chinese Linux distribution:

http://www.sw-linux.com.cn/baihong/

To see it in action, look at section 5.1.5 of the user manual:

http://www.sw-linux.com.cn/download/os/RAYS_LX_1.5_manual.pdf

(I must say, RAYS looks very impressive.)

The second place is the handwriting recognition engine at NCIKU:

http://www.nciku.com/

Not sure if there's any connection other than both being called "Unihan" and both being handwriting recognition engines, but there may be.

Hmm, tempted to try and lift Unihan from RAYS.... :-)

Mitch.

character
Posted January 6, 2008 at 10:37 AM

I guess pinyin might be a problem if your wife doesn't speak standard Mandarin. If she does, I bet she would find it easy to learn and use.

I did some searching, and found these possibilities for handwriting recognition:

http://www.kiang.org/jordan/software/hanzilookup/hanzidict.html
http://www.kiang.org/jordan/software/hanzilookup/
http://tomoe.sourceforge.jp/cgi-bin/en/blog/index.rb
http://www.mandarintools.com/

ljbuesch
Posted January 6, 2008 at 09:14 PM

Capn,

I am a developer of the Hanzi Recognizer and can say that it is an excellent program. I recently created this software with a friend in a college course, but we both still plan to make improvements and enhancements to the program.

I would say that the only drawback to our program is that to get the most accurate results, you need to enter the characters with their proper strokes, in the correct order. Currently, you don't need a pen to input characters, but it does make it a bit simpler.

We only have about 20% of the characters defined in our database and are attempting to get more characters entered. We also hope to release a tool that will allow the community to train the characters, thus making our database bigger more quickly.

Regarding the "Unihan" information, we actually use the Unihan database as one source of information that we gather.

If you have any questions about our software, please let me know; my contact information can be found at the link you provided.

Thanks,
Logan

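To illustrate why stroke order can matter to a recognizer, here is a toy sketch in Python. It is not how Hanzi Recognizer (or any program named in this thread) works internally; the `TEMPLATES` table, the direction labels, and both matching functions are made up for the example. It only shows that a matcher comparing strokes position by position rejects a character drawn in the wrong order, while an order-insensitive matcher still accepts it.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical templates: each character is stored as its stroke
# directions in the conventional writing order.
TEMPLATES: Dict[str, List[str]] = {
    "十": ["horizontal", "vertical"],
    "二": ["horizontal", "horizontal"],
    "人": ["left-falling", "right-falling"],
}


def match_ordered(strokes: List[str]) -> List[str]:
    """Return characters whose template equals the input, stroke by stroke."""
    return [ch for ch, template in TEMPLATES.items() if template == strokes]


def match_unordered(strokes: List[str]) -> List[str]:
    """Return characters whose template has the same strokes in any order."""
    wanted = Counter(strokes)
    return [ch for ch, template in TEMPLATES.items() if Counter(template) == wanted]


# 十 drawn with the vertical stroke first (non-standard order).
drawn = ["vertical", "horizontal"]
print(match_ordered(drawn))    # order-sensitive: no match
print(match_unordered(drawn))  # order-insensitive: still finds 十
```

The order-insensitive version is more forgiving, but it also throws away information that helps tell similar characters apart, which is one plausible reason a young project would start with strict ordering.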
CapnKernel (Author)
Posted January 6, 2008 at 10:42 PM

Hi character,

Thanks for those. (The ones you found are precisely the ones I mentioned in my first post. That you didn't find any different ones tells me we've found all that's out there to be found!)

I downloaded RAYS 2.0 but I can't get the CD to boot. I'll keep trying.

Mitch.

marscheese
Posted January 7, 2008 at 02:46 AM

Thanks for the links everyone! I am also a developer on the Hanzi Recognition program. I have taken a look at many of the links posted here, and haven't seen a lot of this software before. There's some good stuff out there, and I'm glad to see that there are so many people contributing.

I have played around with some of the handwriting recognition programs that have been listed (HanziLookup, Pablo), and they do a fine job. I have noticed that these programs require proper stroke order. This is one feature of the Hanzi Recognition program that is currently being worked on (see our sourceforge website for more details).

Hanzi Recognizer also has the ability to search for individual characters by definition or by pinyin. We use a number of databases (including the Unihan and CEDICT) that are used in our searches, but unfortunately the number of characters that can be recognized via handwriting recognition is only about 20%. It's definitely still a work in progress, but has some really nice features thus far.

If anyone has any questions or comments, please e-mail either myself or Logan (who posted earlier).

--Nathaniel Hobbs
http://hanzirecognizer.sourceforge.net

CapnKernel (Author)
Posted January 7, 2008 at 05:07 AM

ljbuesch wrote:
> We only have about 20% of the characters defined in our database and are attempting to get more characters entered.

If you're looking for a source of stroke data, here's one idea. Erik Peterson's mandarintools.com has an animated character viewer written in Java. I can't find a direct link to it, but here's an example page:

http://www.mandarintools.com/cgi-bin/showstrokes.pl?unival=%E4%B8%80

Note that the stroke data is given in the HTML source. For example, the URL above is for the character "一", and the stroke data is "#1PR:17,122;231,122;243,106;262,129;32,129;24,131;17,123;". How to interpret that? I don't know. If Erik is amenable to having his data used this way (not sure where he got it from), he might be able to tell you, and it may be something you can train your recogniser with.

ljbuesch wrote:
> Regarding the "Unihan" information, we actually use the Unihan database as one source of information that we gather.

Do you mean the well-known "unihan.txt" from the Unicode Consortium (v. useful)? The one I mean is the name of a Chinese input platform written by sw-linux for their RAYS Linux. See here:

http://www.sw-linux.com/en/scripts/main/viewitem.php?itemid=2158511108981584

I also note this package:

http://rays.openrays.org/rays/pool/main/u/unihan-wtpen/

I haven't looked inside (no top-level useful info), so I don't know if it's the recogniser we can see pictures of in the PDF. But it might be worth a look.

Mitch.

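As a starting point for anyone who wants to experiment with that stroke string, here is a minimal Python sketch. It rests on a guess, not on any documentation: that everything after the colon is a semicolon-separated list of "x,y" integer pairs. The meaning of the "#1PR" prefix is unknown, so the code keeps it as an opaque header. The function name `parse_stroke` is made up for this example.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def parse_stroke(record: str) -> Tuple[str, List[Point]]:
    """Split a record like '#1PR:17,122;231,122;' into (header, points).

    Assumption: the part after ':' is a ';'-separated list of 'x,y' pairs.
    The header ('#1PR' here) is returned untouched because its meaning
    is not known.
    """
    header, _, body = record.strip().partition(":")
    points: List[Point] = []
    for pair in body.split(";"):
        pair = pair.strip()
        if not pair:  # skip the empty piece after the trailing ';'
            continue
        x, y = pair.split(",")
        points.append((int(x), int(y)))
    return header, points


sample = "#1PR:17,122;231,122;243,106;262,129;32,129;24,131;17,123;"
header, points = parse_stroke(sample)
print(header)
print(points)
```

Under that reading, the first and last points (17,122 and 17,123) nearly coincide, so the pairs may trace a closed outline of the single horizontal stroke of "一" rather than a pen path. That is only an inference from this one sample.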