大块头 Posted December 8, 2019 at 05:02 PM: To help those learning to read handwritten Chinese or trying to improve their penmanship, here is a compilation of handwritten Chinese characters for reference. It currently contains examples extracted from the HCL2000 database (PDF). I created an individual PNG image for each of 3755 Chinese characters, and each image contains 16 handwritten examples of that character, selected at random from 1000 different writers. For example, "好.png" is shown below. Find the GitHub repo here. Download a ZIP file of the images here (28.8 MB).
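For anyone curious how a sheet like this could be put together, here is a minimal sketch. It is not the code from the repo. It assumes each sample is already available as an equal-sized 2D list of pixel values and that the 16 samples come from 16 distinct writers; the function names `pick_writers` and `tile_grid` are made up for illustration.

```python
import random


def pick_writers(num_writers=1000, k=16, seed=None):
    """Pick k distinct writer indices at random (distinctness is an assumption)."""
    rng = random.Random(seed)
    return rng.sample(range(num_writers), k)


def tile_grid(glyphs, cols=4, pad=1, background=0):
    """Tile equal-sized 2D bitmaps into one sheet, row by row, with padding."""
    if not glyphs:
        raise ValueError("need at least one glyph")
    h, w = len(glyphs[0]), len(glyphs[0][0])
    for g in glyphs:
        if len(g) != h or any(len(row) != w for row in g):
            raise ValueError("all glyphs must have the same size")

    rows = -(-len(glyphs) // cols)  # ceiling division
    sheet_h = rows * h + (rows + 1) * pad
    sheet_w = cols * w + (cols + 1) * pad
    sheet = [[background] * sheet_w for _ in range(sheet_h)]

    for i, g in enumerate(glyphs):
        r, c = divmod(i, cols)
        top = pad + r * (h + pad)
        left = pad + c * (w + pad)
        for y in range(h):
            # Copy one row of the glyph into its cell on the sheet.
            sheet[top + y][left:left + w] = g[y]
    return sheet
```

With 16 glyphs and `cols=4` this gives a 4x4 sheet; writing the result out as a PNG would need an imaging library, which is left out here.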
大块头 (Author) Posted December 8, 2019 at 05:20 PM: Looking through the images, I don't see many examples written in a cursive style. I might expand the compilation to include examples from the HIT-MW or CASIA databases.
imron Posted December 8, 2019 at 11:13 PM: The link to the HCL2000 database is broken.
大块头 (Author) Posted December 8, 2019 at 11:15 PM: 2 minutes ago, imron said: "The link to the HCL2000 database is broken." Fixed, thanks.
markhavemann Posted December 9, 2019 at 03:52 AM: This is really cool! I looked through and noticed that most look fine, but it seems like some of the characters have been scaled in weird ways to fit the boxes, e.g. 一.png. It's not so obvious in most of them but really clear here. I'm not sure if this is something easy to fix or just worth living with.
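If the distortion comes from stretching each sample's bounding box to fill a square cell (an assumption; the thread does not say where the scaling happens), one common fix is to scale by a single factor and center the result. A minimal sketch, with a hypothetical helper name:

```python
def fit_preserving_aspect(src_w, src_h, box_w, box_h):
    """Return (new_w, new_h, offset_x, offset_y) for fitting a source
    rectangle into a box without changing its aspect ratio."""
    if min(src_w, src_h, box_w, box_h) <= 0:
        raise ValueError("all dimensions must be positive")

    if src_w * box_h >= src_h * box_w:
        # Source is relatively wider than the box: width is the limit.
        new_w = box_w
        new_h = max(1, round(src_h * box_w / src_w))
    else:
        # Source is relatively taller than the box: height is the limit.
        new_h = box_h
        new_w = max(1, round(src_w * box_h / src_h))

    # Center the scaled glyph inside the box.
    offset_x = (box_w - new_w) // 2
    offset_y = (box_h - new_h) // 2
    return new_w, new_h, offset_x, offset_y
```

For a wide, flat stroke such as 一 measuring, say, 60x6 pixels in a 64x64 cell, `fit_preserving_aspect(60, 6, 64, 64)` returns `(64, 6, 0, 29)`: the stroke keeps its thin shape and sits in the vertical middle of the cell instead of being stretched to fill it.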
Tomsima Posted December 9, 2019 at 12:06 PM: Just downloaded and took a look. That is a lot of ugly handwriting. It looks like a team of middle school students were given lines for punishment, which were then compiled into a database. Useful nonetheless, I suppose.
imron Posted December 9, 2019 at 12:51 PM: 43 minutes ago, Tomsima said: "Useful nonetheless I suppose" Do you ever use handwriting recognition inputs on your phone? Modern recognizers are almost certainly built with a massive number of samples such as these. So yes, very useful.
大块头 (Author) Posted December 9, 2019 at 01:17 PM: 9 hours ago, markhavemann said: "it seems like some of the characters have been scaled in weird ways to fit the boxes." I noticed that too with 一.png, but I didn't see any others with this issue? 55 minutes ago, Tomsima said: "That is a lot of ugly handwriting, it looks like a team of middle school students were given lines for punishment, which were then compiled into a database. Useful nonetheless I suppose" A lot of the handwriting in HCL2000 certainly looks no better than mine! The 1000 writers were selected to be from a variety of age and educational demographics though. It's not stated in the paper, but I think they instructed them to write in a 楷书 style, which may not result in good penmanship for most people if they are trying to write quickly.
大块头 (Author) Posted December 9, 2019 at 01:26 PM: I'm currently working on a compilation based on the HIT-MW database, which was collected from people copying texts out longhand. I think the examples in that database are more reflective of the sort of handwriting you'd actually encounter in day-to-day life.
Tomsima Posted December 10, 2019 at 02:01 AM: I was focusing more on the comment "those looking to improve their penmanship". While this is an excellent resource for understanding handwriting (and, as rightly pointed out, for the purposes of OCR data), it's probably worth stressing that using these kinds of characters as models for studying and improving your own writing is surely not going to end well. I personally feel like my handwriting gets better when I spend a lot of time reading/looking at good handwriting/calligraphy, as I can see it in my mind's eye more vividly when I am writing. I'm pretty sure that if you were to spend a lot of time studying these kinds of characters for the sake of improving penmanship, you would end up with a disappointing mishmash of messy characters, perhaps worse than when you started.
大块头 (Author) Posted December 10, 2019 at 02:28 AM: 27 minutes ago, Tomsima said: "it's probably worth stressing that using these kinds of characters as models for studying and improving your own writing is surely not going to end well" Good point. I bought a copy of《3SFM实用硬笔字》as per @imron's suggestion. I was planning to use this compilation as a secondary reference while I worked through that book.
imron Posted December 10, 2019 at 03:58 AM: 1 hour ago, Tomsima said: "it's probably worth stressing that using these kinds of characters as models for studying and improving your own writing is surely not going to end well." Agreed! My programmer mind completely skipped over the part in the very first sentence of this thread mentioning "trying to improve their penmanship".
Tomsima Posted December 10, 2019 at 12:07 PM: 9 hours ago, 大块头 said: "Good point. I bought a copy of《3SFM实用硬笔字》" Me too, picking it up when I'm back in China for Spring Festival.
大块头 (Author) Posted September 12, 2021 at 12:35 PM: I created an experimental 30,000-card Anki deck for recognizing handwritten characters using the gigantic CASIA dataset. I also put together a handwriting recognition test here.
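The thread does not say how the deck was built. As a rough sketch of one way to prepare such cards, the snippet below writes a tab-separated text file with an image tag on the front and the character on the back, which is a layout Anki's text import can read when HTML in fields is allowed. The file names and the helper `write_anki_tsv` are hypothetical, and the image files would still have to be copied into Anki's media folder by hand.

```python
import io


def write_anki_tsv(samples, stream):
    """Write one 'front<TAB>back' line per (image_filename, character) pair.

    Hypothetical helper: the front field is an <img> tag pointing at the
    sample image, the back field is the character to recognize.
    """
    for filename, char in samples:
        if any(sep in filename + char for sep in ("\t", "\n")):
            raise ValueError("fields must not contain tabs or newlines")
        front = f'<img src="{filename}">'
        stream.write(f"{front}\t{char}\n")


# Usage with made-up file names:
buffer = io.StringIO()
write_anki_tsv([("hao_0001.png", "好"), ("yi_0001.png", "一")], buffer)
print(buffer.getvalue(), end="")
```

Keeping one handwriting sample per card, instead of one card per character, is what would let a deck reach tens of thousands of cards from a few thousand characters.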
黄有光 Posted September 12, 2021 at 12:39 PM: @大块头 Hey man, your lectures on YouTube are so awesome! When are you planning on doing the next one?
alantin Posted September 12, 2021 at 03:54 PM: On 9/12/2021 at 3:35 PM, 大块头 said: "I created a 30,000-card Anki deck for recognizing handwritten characters using the gigantic CASIA dataset. I also put together a handwriting recognition test here." I got that today and did your recognition test too! Looking good! I got 65% right in the handwriting recognition test.
大块头 (Author) Posted September 12, 2021 at 04:07 PM: On 9/12/2021 at 11:54 AM, alantin said: "I got 65% right in the handwriting recognition test." It's been interesting to see the results as they come in. Native speakers have scored 35-39 out of 40, and CSL learners have scored 16-37 out of 40.
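To put the percentage and the out-of-40 scores on the same scale, a quick conversion helps. This sketch assumes every participant answered the same 40 items; the helper name is made up.

```python
def to_percent(correct, total=40):
    """Convert a raw score to a percentage of the total number of items."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("score must be between 0 and total")
    return 100 * correct / total


# Ranges quoted in the thread, as (lowest, highest) raw scores out of 40.
reported = {"native speakers": (35, 39), "CSL learners": (16, 37)}
for group, (low, high) in reported.items():
    print(f"{group}: {to_percent(low)}% to {to_percent(high)}%")
```

On that scale the native-speaker range is 87.5% to 97.5% and the learner range is 40.0% to 92.5%, so a result of 65% (26 out of 40) sits near the middle of the learner range.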