Hanzi Visual Similarities - minimizing the pain of learning hanzi

January 5, 2007 at 08:35 PM

I spent a few days playing with a table of 6700 Chinese characters, looking at their visual and phonetic similarities in a desperate search for something that would make my own study of characters easier (I know 1200 of them after one year).

The sinoling.com table of characters gives a hanzi sequence for each pinyin reading, for instance: BI bī 逼 bí 鼻荸 bǐ比笔彼鄙匕俾妣吡秕舭 bì 必毕币秘避闭壁臂弊辟碧拂毙蔽庇璧敝泌陛弼篦婢愎痹铋裨濞髀庳毖滗蓖埤芘嬖荜贲畀萆薜筚箅哔襞跸狴

I took each pinyin reading (throwing away the tones for simplicity) and regrouped the characters so that their visual similarities become apparent. After my rearrangement, the BI characters from the example above look like this:

BI

畀bi:鼻濞痹箅

比bi:妣吡秕舭毕毙庇陛篦蓖荜芘筚跸狴哔

必bi:秘泌铋毖

辟bi:避壁臂璧薜襞嬖

敝bi:弊蔽

笔bi:滗

(卑bei):埤俾婢裨髀庳萆

(孛bei):荸

orphans:拂弼愎贲逼碧彼鄙匕币闭

The level of similarity was decided by me alone and may have absolutely no linguistic or etymological justification.
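The regrouping described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the 6700-character table: the similarity labels in `GENERIC_OF` are a small hand-made sample taken from the BI example, and the names `strip_tones`, `parse_readings`, `regroup` and `GENERIC_OF` are all hypothetical.

```python
import re
import unicodedata

# Combining marks for the four pinyin tones (macron, acute, caron, grave).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}

def strip_tones(text: str) -> str:
    """Remove pinyin tone marks but keep the diaeresis of ü."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)

def parse_readings(line: str) -> dict[str, str]:
    """Merge 'bī 逼 bí 鼻荸 ...' into one toneless entry per reading."""
    merged: dict[str, str] = {}
    pattern = r"([a-z\u00fc]+)\s*([\u4e00-\u9fff]+)"
    for reading, chars in re.findall(pattern, strip_tones(line)):
        merged[reading] = merged.get(reading, "") + chars
    return merged

# Assumption: hand-made labels, derived character -> generic character.
GENERIC_OF = {
    "秘": "必", "泌": "必",
    "避": "辟", "壁": "辟", "臂": "辟",
    "弊": "敝", "蔽": "敝",
}

def regroup(chars: str, generic_of: dict[str, str]):
    """Split the characters of one reading into tail sequences and orphans."""
    heads = set(generic_of.values())
    groups: dict[str, list[str]] = {}
    orphans: list[str] = []
    for ch in chars:
        if ch in generic_of:
            groups.setdefault(generic_of[ch], []).append(ch)
        elif ch in heads:
            groups.setdefault(ch, [])  # the generic character itself
        else:
            orphans.append(ch)
    return groups, orphans

readings = parse_readings("bì 必秘避壁臂弊辟蔽敝泌 bī 逼")
groups, orphans = regroup(readings["bi"], GENERIC_OF)
for head, tail in groups.items():
    print(f"{head}bi:{''.join(tail)}")
print("orphans:" + "".join(orphans))
```

The hard part, deciding which characters look alike, stays manual in this sketch; the code only does the bookkeeping once those labels exist.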

Having done this with all 6700 characters in the table, I ended up with a list of generic (or rather quasi-generic) characters and their corresponding tail sequences of "derived" characters. When no visual similarity was found, I called the character an orphan and put it aside. I also analyzed the orphans and, where possible, attached them non-phonetically to existing sequences, eventually reducing their number to a little over two hundred. In the end I had 1432 sequences and 217 orphans sieved from those 6700 characters. I threw away some of the sequences to optimize for learning the first 4000 characters.

I found that in order to read or systematically learn the first four thousand characters, I need to familiarize myself with only 848 generic characters and 217 orphan characters. As a bonus, by learning this group of 1065 characters I will in fact be prepared to recognize another 4500 characters, for a total of 5500+!
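The arithmetic behind these figures can be checked quickly. The snippet below only restates the numbers quoted in this post; the variable names are made up for illustration.

```python
generic = 848      # generic characters to learn
orphans = 217      # orphan characters to learn
derived = 4500     # further characters recognizable afterwards

to_learn = generic + orphans   # 1065
total = to_learn + derived     # 5565, i.e. "5500+"
share = to_learn / total       # fraction of the total that is learned directly

print(to_learn, total, f"{share:.0%}")
```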

I generated an XLS file with my master database of all rearranged characters as well as the two basic lists. The file can be downloaded from http://otaflegr.com/chinese/hanzi-similarities.zip (101 kB)

For the hanzi translations in my lists I used the files at http://lingua.mtsu.edu/chinese-computing/statistics/

January 6, 2007 at 11:55 AM

Thank you for sharing your ideas about a system of generic characters/elements to simplify the study of the Chinese writing system.

You or somebody else interested in this kind of study might also want to see Mary Ansell's STRUCTURAL GROUPS TABLE.

Ole

January 6, 2007 at 02:08 PM

I took each pinyin reading (throwing away the tones for the reason of simplification) and re-grouped the characters for their visual similarities to become apparent.

I think what you've done is mainly to identify 'sound loan' characters that are still valid under modern Mandarin pronunciation.

If there's someone out there who knows how Chinese pronunciation has changed since the period when the characters were invented, you might well be able to find more relationships that would help people remember characters.

I suggest this because I've heard that the sounds of a language change according to general 'rules'. For example, 辟劈, pronounced pi, are clearly related to 避壁臂, all pronounced bi, presumably because they were all pronounced the same way (or more similarly) at the time these characters were invented.

January 6, 2007 at 07:17 PM

Ole, thanks for the interesting link.

January 7, 2007 at 03:48 AM

Nice post, otaflegr. You may also be interested in the Heisig method (sorry, no link: computer problems). It only teaches how to write a character when prompted by its meaning, but after that, recognizing the characters and adding pronunciations and additional meanings is greatly simplified. Learning all 4000 in a year would be possible (approximately 400 hrs).

July 23, 2011 at 01:51 PM

Hi otaflegr,

I'm also interested in your work, but the link to your Excel file is no longer valid. Can you upload the file again, please? Thanks.

July 24, 2011 at 03:10 PM

@Otaflegr: If you check resources such as zhongwen.com or smarthanzi.net, you'll be able, via the phonetic breakdowns and components they provide, to reunite many of the "orphans" with their parents.

July 24, 2011 at 11:15 PM

Phono-semantic compound characters "are often called radical-phonetic characters. They form the majority of Chinese characters by far—over 90% [1]", but note that not all of them are phonetically accurate. Of these, only 58% (3688) match their phonetic element in both initial and final. The second largest group, about 13% (819), has a matching final but only a close pronunciation of the initial. Overall, the PS characters are rated on 6 levels:

*NOTE: This has not taken tones into account

类别 - 字数 - % - 累计%
0 声韵全同 - 3688 - 58 - 58

1 韵同声近 - 819 - 13 - 71

2 韵同声异 - 782 - 12 - 83

3 声同韵异 - 376 - 6 - 89

4 声或韵近 - 485 - 7 - 96

5 声韵全异 - 250 - 4 - 100

Chinese source: http://chinese.exponode.com/2_1j_1.htm

- - - - - -

Rough Translations

- - - - - -

Type - Number - % - Cumulative %

0 Matching Initial + Final - 3688 - 58 - 58

1 Matching Final + Close Initial - 819 - 13 - 71

2 Matching Final + Different Initial - 782 - 12 - 83

3 Matching Initial + Different Final - 376 - 6 - 89

4 Close Initial or Final - 485 - 7 - 96

5 Different Final + Initial - 250 - 4 - 100
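To make the six levels concrete, here is a rough Python sketch that rates a character's toneless reading against the reading of its phonetic element. It is not the method used by the Chinese source, which does not say how "close" is defined. As an assumption, two initials count as close when they share a place of articulation (see `CLOSE_GROUPS`), closeness of finals is not modelled at all, and y/w are treated as initials for simplicity. All names are hypothetical.

```python
# Two-letter initials come first so that "zh" is not read as "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

# Assumption: initials in the same set count as "close".
CLOSE_GROUPS = [{"b", "p", "m", "f"}, {"d", "t", "n", "l"}, {"g", "k", "h"},
                {"j", "q", "x"}, {"zh", "ch", "sh", "r"}, {"z", "c", "s"}]

def split_syllable(pinyin: str) -> tuple[str, str]:
    """Split toneless pinyin into (initial, final); initial may be empty."""
    for initial in INITIALS:
        if pinyin.startswith(initial):
            return initial, pinyin[len(initial):]
    return "", pinyin

def initials_close(a: str, b: str) -> bool:
    return any(a in group and b in group for group in CLOSE_GROUPS)

def match_level(character: str, phonetic: str) -> int:
    """Rate the two readings on the 0-5 scale of the table above."""
    ci, cf = split_syllable(character)
    pi, pf = split_syllable(phonetic)
    if cf == pf:
        if ci == pi:
            return 0                              # initial and final match
        return 1 if initials_close(ci, pi) else 2
    if ci == pi:
        return 3                                  # same initial, other final
    # Simplification: only the initial is checked for closeness here.
    return 4 if initials_close(ci, pi) else 5

print(match_level("fang", "fang"))  # e.g. 房 against 方
print(match_level("bi", "pi"))      # e.g. 避 against 辟 read as pi
print(match_level("bi", "bei"))     # e.g. 埤 against 卑
```

With a different definition of "close", levels 1, 2, 4 and 5 would shift, so the counts in the table should not be expected to come out of this sketch unchanged.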

Example of Level 0

http://www.mdbg.net/chindict/chindict.php?page=worddict&wdrst=0&wdqb=p%3Afang

Pictophonetic characters often become outdated as the Chinese language changes, and afterward their components usually no longer give a reliable semantic or phonetic hint. Some of the simplified characters have had their phonetic element replaced by another with a close pronunciation and fewer strokes. Overall, however, more characters became accurate after the simplification.

July 25, 2011 at 06:48 AM

No, I was looking for a list showing similar-looking characters. I'm creating my personal flashcard deck. An example would be 我 and 找. Maybe this is the wrong list?

July 25, 2011 at 05:29 PM

We have a different thread on characters that look similar. That might give you some ideas as well.

If you put them together into a flashcard deck, please post it; it would be useful for others.

otaflegr

Ole

onebir

otaflegr

leosmith

slabo

Gharial

DespikableMi

slabo

jbradfor
