
Implementing Pleco-like Handwriting Recognition



Posted

There are lots of programs that let you "write" a character on screen with your mouse or finger and then recognize the character you're writing: Pleco, Nciku, Skritter, and a TON of others online.

How do they do that? Is everybody using a database or library that I don't know about?

Thanks!

Posted

We license ours from a company in China called Hanwang (best known these days for their e-readers). Their technology is spectacular / best-in-class, but it's quite expensive and probably not practical unless you start charging subscription fees (and get lots of paid subscribers). A number of other companies in China also make handwriting recognition engines, but the cheapest quote we've found anywhere was about $8k/year. (Unfortunately that bid was confidential, so I can't say which company it came from.)

There are, however, a couple of open-source Chinese handwriting projects floating around that you could take a look at; just do an internet search and you should turn up a few. I can't say I've done much comparison-shopping among them, but I believe a few other websites / apps are using them now and they're getting better and better. You might find that your own character data / character-indexing efforts are also helpful here.

Nciku used to have an "in collaboration with UniHan" banner on their handwriting box, so I'd suggest trying to figure out which company in China makes UniHan and inquiring about licenses from them. In the case of Skritter I believe Nick wrote their algorithm himself rather than licensing it from someone - that's not an impossible task, but it'd probably take at least a year or two of concentrated effort to get something usable.

Posted

I think Chinese-tools.com and popupchinese also have recognition engines - not sure what they're based on.

Posted

It would appear that this type of software works from the stroke order of the character and the direction of each stroke as you write it. Check out http://guide.wenlininstitute.org/wiki/Character_Description_Language. You may be able to dig up a library of character stroke orders somewhere, or build one from first principles, e.g. starting from the Unihan data at http://www.unicode.org/charts/unihan.html (by the way, that Unihan is the Unicode Consortium's Han character database, so it's based in the US with Unicode itself, and it may not be the same "UniHan" that was on Nciku's banner).
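To make the stroke-order-and-direction idea a bit more concrete, here is a toy sketch in Python. It is only my guess at the general shape of such a matcher, not how Pleco, Hanwang, or Skritter actually do it. Everything in it is made up for illustration: the 8-direction codes, the `TEMPLATES` table (three hand-written entries), and the function names. A real engine would need stroke data for thousands of characters and would have to tolerate wrong stroke order, merged strokes, and curved strokes.

```python
import math

# Hypothetical 8-way direction codes, screen coordinates (y grows downward):
# 0 = right, 1 = down-right, 2 = down, 3 = down-left,
# 4 = left,  5 = up-left,    6 = up,   7 = up-right

def stroke_direction(points):
    """Reduce one stroke (a list of (x, y) points) to a direction code,
    using only its first and last point."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    angle = math.atan2(y1 - y0, x1 - x0)      # radians in (-pi, pi]
    return round(angle / (math.pi / 4)) % 8

def signature(strokes):
    """Direction codes of all strokes, in the order they were drawn."""
    return tuple(stroke_direction(s) for s in strokes)

def direction_distance(a, b):
    """Steps between two direction codes around the 8-way circle (0..4)."""
    d = abs(a - b)
    return min(d, 8 - d)

# Toy, hand-written template table (an assumption, not real data).
TEMPLATES = {
    "十": (0, 2),   # horizontal, then vertical
    "人": (3, 1),   # falling left, then falling right
    "二": (0, 0),   # two horizontals
}

def recognize(strokes):
    """Return candidate characters with the same stroke count,
    best match first (lowest total direction distance)."""
    sig = signature(strokes)
    scored = []
    for char, template in TEMPLATES.items():
        if len(template) != len(sig):
            continue  # this sketch requires an exact stroke-count match
        total = sum(direction_distance(a, b) for a, b in zip(sig, template))
        scored.append((total, char))
    return [char for _, char in sorted(scored)]
```

For example, a slightly wobbly horizontal stroke followed by a vertical one, `recognize([[(10, 50), (90, 52)], [(50, 10), (51, 90)]])`, ranks 十 first. Collapsing each stroke to a single direction is the crudest thing that could work; it cannot tell apart characters that differ only in stroke length or position, which is part of why a usable recognizer takes so much more effort.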

Happy hunting.

PS: In case you don't spot them on that page, you'll also want to check out these two links:

http://www.wenlin.com/cdl/cdl_spec_2003_10_31.pdf

http://www.wenlin.com/cdl/cdl_strokes_2004_05_23.pdf
