atitarev Posted December 19, 2005 at 10:26 PM

If you're learning or know both Japanese and Chinese, you will have noticed that some kanji, although of Chinese origin, have been simplified, have deviated from the original, or use a different version of the same character:

[ki] 気 (Japanese), [qì] 气 (simplified), 氣 (traditional)
[zasshi] 雑誌 (Japanese), [zázhì] 杂志 (simplified), 雜志 or 雜誌 (traditional)
[ongaku] 音楽 (Japanese), [yīnyuè] 音乐 (simplified), 音樂 (traditional)

The first character in the Japanese [mainichi] 毎日 is not exactly the same as in the Chinese 每天. I can't show it here, but the first character in the Japanese weekdays, 曜日 [yōbi], also looks different from the Chinese version: the top doesn't seem to contain the 羽 radical. This probably has to do with the font used for Japanese (MS Mincho) rather than a real difference from Chinese, but I'm not sure.

My observation is that, out of the 2,500 most common Japanese characters, of those with deviations about 30% (the most common ones) match the Chinese simplified and about 70% match the Chinese traditional characters. I haven't completed my analysis. I only do it so as not to forget Japanese while learning Chinese, and to map what I have learned in both languages, so that I know the meanings, the writing (modern Japanese, simplified and traditional) and the different readings (Japanese ON and KUN, and Mandarin Chinese). Very often there may be various versions of a character in both languages but only one is used: 毎 or 每.

I know these are now completely different languages in terms of grammar, pronunciation and, largely, vocabulary, but I am interested in mapping characters (J. vs C.), which you can do with the Unihan dictionary, Firefox's Moji plugin, etc.

Does anyone have an online (English?) description of those Japanese characters that differ from Chinese, or of the Japanese simplification? I'd like to make a list of common Japanese characters that are different from Chinese; there won't be too many. If you know some examples, please post.

perjp Posted December 21, 2005 at 09:06 AM

The Unihan database lists Japanese variants. A Japanese kanji dictionary such as Kanjigen will also list variants. I'm sure I've seen kanji dictionaries with complete lists as well. As for an online list, the closest I came is this page: http://www.aozora.gr.jp/kanji_table/touyoukanji_jitaihyou/

Simplified characters in Japanese are called 新字体, traditional characters are called 旧字体. Googling those might give some results. You could also try 常用漢字字体表.

atitarev (Author) Posted December 21, 2005 at 10:26 AM

That's a good link, thank you. It's a shame the characters are in image files, not text, but it is probably impossible otherwise in some cases.

I found the following:
2501 most frequent kanji: http://www.harmful.org/homedespot/newtdr/NEWtdrARCHIVE/7diary/byfreq.html
Jōyō kanji - Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/J%C5%8Dy%C5%8D_kanji

Interesting how the Chinese terms 簡体字 and 繁体字 are similar to the Japanese 新字体 and 旧字体.

According to http://www.omniglot.com/writing/japanese_kanji.htm these kanji were invented in Japan, but I found them in Unihan:

働く hataraku - to work
凪 nagi - lull
匂い nioi - smell
杢 moku - woodworker
御座 goza - mat
凧 tako - kite
辻 tsuji - crossroad
辷る suberu - to slip

yingguoguy Posted December 21, 2005 at 11:06 AM

After reading your post, I decided to play around with the Unihan file to see if I could generate a list myself, as I recently started learning Chinese on top of Japanese. You indicated you'd done something similar, but this is what I came up with...

A little explanation is probably required. The first number is the Henshall number (from A Guide to Remembering Japanese Characters, which lists the joyo kanji in grade order), followed by the Japanese version of the hanzi/kanji. The hanzi marked trad: or simp: are the traditional and simplified fields respectively, listed only when they differ from the Japanese.

Now the strange part... There's also a field called zVariant, which apparently gives semantic equivalents for the character. Some of these are funky old versions of the characters or simplified side radicals, but you need to use it to get the really interesting trios of characters, like:

bad (warui) Japanese: 悪 Simplified: 恶 Traditional: 惡

and the ki/qi example you mention above. (This is my first post, sorry if my hanzi get mangled.)

These variants are listed under zvar, with ztrad and zsimp giving their traditional and simplified versions respectively.

I've had a look through the file and it seems like a reasonable attempt, though there are quite a few strange zvar characters in there. There might be some way to strip these out using other Unihan fields. Also, quite a few of the variants look very similar; this might just be down to the font.

I've also got a version of this file with on, kun, pinyin and definitions in it, but I had to strip these out to get it under the file size limit.

Attachment: variants.doc

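For anyone wanting to reproduce this kind of list, the lookup described in the post above can be sketched roughly as follows. This is only an illustration: it assumes the tab-separated layout of the Unihan variant data (code point, field name, value), the sample lines are built inside the script rather than copied from the database, and the helper names `parse_variants` and `trio` are made up for this example.

```python
from collections import defaultdict


def u(ch):
    """Format a character as a Unihan-style code point such as 'U+XXXX'."""
    return f"U+{ord(ch):04X}"


# Illustrative sample only: hand-built lines in the assumed Unihan layout
# (code point <TAB> field <TAB> space-separated values), not real file content.
SAMPLE = "\n".join([
    "# code point\tfield\tvalue(s)",
    f"{u('惡')}\tkSimplifiedVariant\t{u('恶')}",
    f"{u('悪')}\tkZVariant\t{u('惡')}",
    f"{u('氣')}\tkSimplifiedVariant\t{u('气')}",
    f"{u('気')}\tkZVariant\t{u('氣')}",
    f"{u('国')}\tkTraditionalVariant\t{u('國')}",
    f"{u('國')}\tkSimplifiedVariant\t{u('国')}",
])


def parse_variants(text):
    """Return {char: {field: [variant chars]}} from Unihan-style lines."""
    table = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        code, field, values = line.split("\t", 2)
        variants = []
        for value in values.split():
            # Drop any '<source' suffix that may follow a code point.
            code_point = value.split("<", 1)[0]
            variants.append(chr(int(code_point[2:], 16)))
        table[chr(int(code[2:], 16))][field] = variants
    return table


def trio(japanese, table):
    """Return (Japanese, simplified, traditional) for one Japanese character.

    The traditional form is taken from kZVariant if present, otherwise from
    kTraditionalVariant, otherwise the character itself is assumed.
    """
    entry = table.get(japanese, {})
    trad = (entry.get("kZVariant")
            or entry.get("kTraditionalVariant")
            or [japanese])[0]
    simp = table.get(trad, {}).get("kSimplifiedVariant", [trad])[0]
    return japanese, simp, trad


table = parse_variants(SAMPLE)
for kanji in "悪気国日":
    print(" ".join(trio(kanji, table)))
```

Taking only the first listed variant is a simplification. As the post notes, the real zVariant field also contains odd old forms, so the output of a script like this would still need checking by hand.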
malinuo Posted December 21, 2005 at 05:50 PM

Also check http://en.wikipedia.org/wiki/Unihan for characters with different writing styles but the same Unicode code point.

atitarev (Author) Posted December 21, 2005 at 11:47 PM

Thanks for your post, yingguoguy. I see you are doing a similar analysis to mine; thanks for the file. I can send my spreadsheet of analysed differences, which covers 2501 characters.

bad (warui) Japanese: 悪 Simplified: 恶 Traditional: 惡

The difficulty in finding cases like this is that all versions of a character also exist in the other language but are not used there. For example, the WARUI character exists in Chinese: 悪 [è] [wù] (variant of 恶/惡). It would be hard to find them character by character.

What I was looking for is some kind of list or database of differences, or a conversion tool that can convert from Chinese (simplified/traditional) to modern Japanese and vice versa. I'll keep looking; maybe someone else will post.

I've got the answer for the 曜 character, thanks to perjp's post. The 羽 radical is simplified in Japanese: in modern Japanese the feathers are straight, not at an angle.

Malinuo, thanks for the link. Firefox plays up and shows all the characters the same. In Internet Explorer some Korean characters are displayed as squares, although I installed the support for Korean.

yingguoguy Posted December 22, 2005 at 01:37 AM

Atitarev, I'd be interested in seeing your spreadsheet. Please could you send it to me?

Isn't it just the Joyo kanji that are officially simplified in Japan? Anything that's not on that list should be the same as the traditional form? Of course you might have multiple forms of the same traditional character, but I don't think there's going to be any decision about which of these is the 'official' one. You could do matching against words in EDICT/CEDICT to see which ones they prefer, or get a frequency list from somewhere, but I don't think you can get this perfect with a computer.

I remember reading somewhere that most heavyweight Chinese dictionaries contain all the Japanese-only characters, as they're considered part of the same family. I wonder if there's another systematic list of Japanese-only characters anywhere. Henshall has 畑 (hata) as a Japanese-made character, though one that's been adopted by the Chinese as well.

Malinuo, I can't see the Wikipedia entry, as I'm in China and it's blocked. This really annoys me, as I want to use it mostly for Chinese language learning, so the government is really poking itself in the eye :-). Is it something that can be cut and pasted?

malinuo Posted December 22, 2005 at 07:36 AM

I copied the source of the Wikipedia article to http://lewan.chez-alice.fr/hanuni.html . Someone with better knowledge of this forum and its HTML capabilities may be able to copy it straight here instead.

roddy Posted December 22, 2005 at 08:00 AM

I think copying and pasting that page in would just result in a mess, one way or another.

Quote: "I had to strip these out to get it under the file size limit."

If it's possible to break it down into two or more files, you can do that. Otherwise, you can email it to me (admin@ ) and I'll upload it manually.

Roddy

roddy Posted December 22, 2005 at 10:13 AM

Attaching the above-mentioned file. I've also increased the limits for .doc and .txt files.

Roddy

Attachment: variants.txt

atitarev (Author) Posted December 22, 2005 at 11:41 AM

As discussed, I am uploading the file with the 2501 most common Japanese characters and their simplified and traditional equivalents in Chinese. The Japanese character column is original (unchanged) from http://www.harmful.org/homedespot/newtdr/NEWtdrARCHIVE/7diary/byfreq.html

I got the simplified Chinese by simply pasting the Japanese characters into Wenlin and converting them to simplified, and used the same method for the Chinese traditional. It doesn't convert properly in every case: 悪, for example, stays the same and is not converted to 恶 or 惡, because 悪 is just a "variant". The other columns show whether the Japanese matches the Chinese simplified or traditional (true/false).

I only use it for my own reference. Obviously, it doesn't show which characters are not in use in Chinese, like the example above and others, so I don't guarantee the accuracy of the Chinese part.

Part 2 of the file (incomplete) has the Japanese characters with all readings and translations. I have added 500 Chinese characters (simpl./trad.) with all pinyin readings. I'm not attaching it here yet, as I still have to do the remaining 2000. I can't just copy and paste because of the merged cells, and it's a bit time-consuming to align them.

To give you an idea of what it looks like:

# | Kanji | ON | KUN | Meaning | S. | T. | Pinyin
1 | 日 | ニチ, ジツ | ひ, -び, -か | day, sun, Japan | 日 | 日 | rì
2 | 一 | イチ, イツ | ひと-, ひと.つ | one | 一 | 一 | yī
3 | 国 | コク | くに | country | 国 | 國 | guó
4 | 会 | カイ, エ | あ.う, あ.わせる, あつ.まる | meeting, meet, party, association, interview, join | 会 | 會 | huì, huǐ, kuài

BTW, CQuickTrans lets you get characters with all their readings, but I have trouble copying from the trial version.

Thanks to Roddy for increasing the file size limit.

Attachment: 2501 most frequent kanji (with Chinese) part 1.xls

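The true/false columns described above amount to a simple per-character comparison, which could be sketched like this. The rows are a handful of examples quoted in this thread, entered by hand (not the spreadsheet itself), and the function name `classify` is made up for illustration.

```python
from collections import Counter

# (Japanese, simplified, traditional) rows taken from examples in this thread.
ROWS = [
    ("日", "日", "日"),
    ("一", "一", "一"),
    ("国", "国", "國"),
    ("会", "会", "會"),
    ("誌", "志", "誌"),
    ("楽", "乐", "樂"),
    ("気", "气", "氣"),
    ("悪", "恶", "惡"),
]


def classify(japanese, simplified, traditional):
    """Say which Chinese form, if any, the Japanese character is identical to."""
    matches_simp = japanese == simplified
    matches_trad = japanese == traditional
    if matches_simp and matches_trad:
        return "both"
    if matches_simp:
        return "simplified"
    if matches_trad:
        return "traditional"
    return "neither"


counts = Counter(classify(*row) for row in ROWS)
for label in ("both", "simplified", "traditional", "neither"):
    print(f"{label:12} {counts[label]}")
```

The comparison is only as good as the Chinese columns fed into it. If a converter leaves 悪 unchanged, as described above, that row would be counted as "both" instead of "neither", so the variant cases still have to be corrected by hand first.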
yingguoguy Posted December 22, 2005 at 02:40 PM

社 shows up with 示 on the left side in Excel (i.e. as an old variant), though when I paste it into this message it corrects itself. The traditional and simplified ones are 社 though. Any idea what's going on here?

My understanding is that 悪 is the Japanese form, 恶 is the simplified form, and 惡 is the traditional form of the character. So writing 恶 or 惡 in Japanese (or 悪 in Chinese) would be strange, though probably understandable. I think these forms need to be in the list. 悪 isn't in my (New Century) Chinese dictionary.

As another example, you have 歩 as being the same for all, whereas AFAIK both the simplified and traditional forms would be 步. This confused the hell out of me at first, as I would be corrected by a Chinese friend to one version, then a few weeks later a Japanese friend would correct me back. The Japanese version seems more intuitive to me, and even a Taiwanese friend said she was confused about why this character was like that. However, it's exactly these kinds of differences that I'm interested in. This difference isn't really shown properly in my version yet either, as it's easy to assume that 步 is some weird old form.

Automating this probably gets you 90% of the way there, but I think someone needs to look at the lists and find the most suitable form for each locale. It'd be nice to have a comprehensive list. I'm happy to go through and merge changes between the two versions. Can we co-operate on this?

If it's the formatting of the spreadsheet that's taking all your time, then if you wanted to write a text file I could convert it into a nice Excel spreadsheet. If it's the cutting and pasting, though, I guess it won't help you much.

I hadn't seen Wenlin before. I played with the demo and it seems pretty cool. Is it worth $200+ though? If so I may have found my Christmas present to myself.

Merry Christmas

yingguoguy Posted December 22, 2005 at 02:47 PM

Malinuo, thanks for posting that link. I appreciate it.

atitarev (Author) Posted December 28, 2005 at 06:35 AM

Thanks, Yingguoguy, for the interesting post. It's sometimes really difficult to trace all the differences; that's why I started the thread.

Compare the following too. The Japanese 専 is different from both 专 and 專, although it looks more like the traditional. I don't seem to have a tool that shows which version of a character is the modern one in Japanese and which is the traditional one. Pasting the Chinese traditional 專 into the Japanese NJStar doesn't cause any problem and shows the character info, but it doesn't show any dictionary entries containing it.

専業 sengyō (J)
专业 zhuānyè (S)
專業 (T)

yingguoguy Posted December 28, 2005 at 11:01 AM

I've moved my list into a spreadsheet and gone through it trying to work out the variants. I don't really know too much about traditional characters, so some of them might be really old variants and not match current usage. There are bound to be some mistakes as well, so I'd appreciate any corrections.

I've put the Chinese and Japanese columns into different fonts because some characters, like 直, show up differently even though Unicode has them as the same character.

As you say, the trio 専 專 专 gets very confusing. 博 of 博物馆 is the same in all three variants, as are 縛 and 薄. And you also have 転 轉 转, which I'd never have spotted if I'd not done this. Also, I'm sure I've seen the traditional 專 in a Japanese character somewhere, though I might just be getting confused with 偶.

Attachment: variants.xls

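One way to tell a genuine character difference from a font difference is to compare code points, since look-alike characters such as 歩 and 步 are encoded separately while 直 is a single character drawn differently by Japanese and Chinese fonts. A minimal Python sketch using only the standard library follows; the helper names `code_points` and `same_code_point` are invented for this example.

```python
import unicodedata


def code_points(chars):
    """Map each character in a string to its 'U+XXXX' code point."""
    return {ch: f"U+{ord(ch):04X}" for ch in chars}


def same_code_point(a, b):
    """True if two one-character strings are the same Unicode character."""
    return ord(a) == ord(b)


# Pairs from this thread that look alike but are encoded separately.
for japanese, chinese in [("歩", "步"), ("毎", "每")]:
    print(japanese, chinese, same_code_point(japanese, chinese),
          code_points(japanese + chinese))

# The three-way case: Japanese, traditional and simplified are all distinct.
print(code_points("専專专"))

# A single encoded character: any visible difference comes from the font.
print("直", code_points("直"), unicodedata.name("直"))
```

If two characters report the same code point, no list or conversion table can distinguish them; the difference has to be handled by choosing a font per language, as done with the separate columns in the spreadsheet above.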
Yuchi Posted December 28, 2005 at 01:58 PM

Are ordinary Japanese people able to read 旧字体? Or is it like mainland China, where certain portions of the population can't read traditional?

yingguoguy Posted December 28, 2005 at 02:55 PM

Not too sure about that, but I'll hazard a guess. All of Japan has gone with the same simplifications, so you don't get the whole mainland/overseas split you have with Chinese, and the only place you're really going to see old-style characters is in pre-1945 literature, inscriptions, etc.

The simplifications in Japanese are much less far-reaching than the Chinese mainland simplifications, and there are only a few characters which look significantly different from their old forms. I'm guessing that there are only going to be occasional times when a Japanese person would hit a character they don't know and can't figure out from the context. I think they might struggle to write most of the traditional ones though.

Oh, looking at your post again, are you asking whether they can read Japanese written in the old style, or by 旧字体 do you mean traditional Chinese? They'd probably be happier reading traditional Chinese, but there are a hell of a lot of characters not in use in Japan, and grammar would be a problem.

atitarev (Author) Posted December 28, 2005 at 10:03 PM

I get the impression that the Japanese sometimes mix up the modern and traditional characters. The NJStar dictionary has entries with 國, although it is never used the way 国 (the modern simplified form) is, and some comics use 蟲, not 虫.

I wonder why so many simplifications in Japanese and Chinese coincide: 会, 虫, 国, 声, 体, 台, 昼, 当, 点, 同, 麦, 来, 里, 医, 温, 向, 号, 写, 面, 礼, 变, 学, etc. They must have borrowed from each other or, as someone mentioned, simplified characters existed for quite some time as shortcuts. I find writing the traditional versions of the above characters by hand quite challenging.

yingguoguy Posted December 29, 2005 at 02:26 PM

Quote from Modern Chinese Characters by Yin Binyong:

"Chinese characters have a long history of simplification, which in fact started almost from their beginning. We find simplified characters even in the oracle bone inscriptions of the Shang Dynasty three thousand years ago. We may say that Chinese characters have undergone a ceaseless process of simplification, from complex to simple, starting from the time of the oracle bone inscriptions, down through the bronze inscriptions, seal script, official script and standard script ... These simplified character forms generally were produced by the populace and enjoyed currency among them in their everyday hand-writing, but were never officially accepted by scholars or the government, who referred to them as 俗字 (su2 zi4) or vulgar characters"

He goes on to list about a hundred, but here's a quick selection based on those I can remember the pinyin for:

Pre-Qin period (before 246 BC): 从尔个气网云虫...
Han Dynasty (206 BC - 200 AD): 达台万号...
Wei, Jin and North South Dynasties (220 - 581 AD): 声双笔...
Sui-Tang Period (581 - 907 AD): 干
Song-Yuan Period (950-1368 AD): 当点
Ming-Qing Period (1368-1911 AD): 鱼门几桥么

It seems like a lot of the simplified forms existed even before kanji arrived in Japan. Both the Japanese and Chinese simplification programmes were mostly about choosing the easiest-to-learn version of existing characters, and so a lot of the time, but not always, they ended up choosing the same one.

Lugubert Posted February 13, 2006 at 03:19 PM

I hope that I'm not deviating too much from the subject of this thread, but does anyone out there know if there are search tools to be had for the Mojikyo fonts, other than radical + stroke number?