ChristopherB Posted November 5, 2008 at 07:50 AM
I'm aware that it's difficult for software to always convert one character set to the other perfectly, because there is no one-to-one mapping between every traditional and simplified character. My question is: does one direction have a better chance of converting correctly than the other? That is, is traditional to simplified likely to be more accurate than the other way around?
skylee Posted November 5, 2008 at 08:49 AM
"is traditional to simplified likely to be more accurate than the other way around?"
Yes, because one simplified character may represent more than one traditional character. Examples include 只, 发, 于, 云, etc.
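To make the asymmetry concrete, here is a minimal sketch in Python. The table is a hand-written, illustrative subset covering only the four characters mentioned above; the names `SIMP_TO_TRAD`, `invert` and `ambiguous` are made up for this example and do not come from any library.

```python
# Illustrative subset only, NOT a complete conversion table.
# Each simplified character lists the traditional characters it can stand for.
SIMP_TO_TRAD = {
    "只": ["只", "隻"],
    "发": ["發", "髮"],
    "于": ["于", "於"],
    "云": ["云", "雲"],
}


def invert(simp_to_trad):
    """Build the reverse (traditional -> simplified) table."""
    trad_to_simp = {}
    for simp, trads in simp_to_trad.items():
        for trad in trads:
            trad_to_simp.setdefault(trad, []).append(simp)
    return trad_to_simp


def ambiguous(table):
    """Return the keys that have more than one possible conversion."""
    return sorted(key for key, values in table.items() if len(values) > 1)


TRAD_TO_SIMP = invert(SIMP_TO_TRAD)

print(ambiguous(SIMP_TO_TRAD))  # all four simplified characters are ambiguous
print(ambiguous(TRAD_TO_SIMP))  # nothing in this subset is ambiguous
```

Within this small subset every simplified character needs context to convert, while every traditional character converts unambiguously. That only holds for this sample: a handful of traditional characters are ambiguous too, but far fewer.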
trevelyan Posted November 6, 2008 at 10:54 AM
If you actually have to deal with Chinese data processing, you should probably just use existing tools to make the problem go away. Even if you store all of your own data in traditional, you'll have to deal with simplified input at some point. The best solution is to use software that does word- and phrase-level conversion instead of character-level conversion. Adso takes care of this and can be downloaded from http://adsotrans.com/downloads/. Google is getting better at it too.
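For anyone wondering what word- and phrase-level conversion looks like in practice, here is a rough sketch in Python. It is not how Adso or Google actually do it; it is just a greedy longest-match lookup over a tiny phrase table, with a character-level fallback. The tables `PHRASES` and `CHAR_DEFAULTS` and the function `to_traditional` are hypothetical and only hold a few entries for illustration.

```python
# Hypothetical, tiny tables for illustration. A real converter needs a
# large dictionary and, ideally, proper word segmentation.
PHRASES = {
    "头发": "頭髮",  # hair
    "理发": "理髮",  # haircut
    "发展": "發展",  # to develop
}
# Fallback used when no phrase matches: the most likely single-character guess.
CHAR_DEFAULTS = {"发": "發", "头": "頭"}

MAX_PHRASE_LEN = max(len(phrase) for phrase in PHRASES)


def to_traditional(text):
    """Convert simplified text, preferring the longest matching phrase."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, down to two characters.
        for n in range(min(MAX_PHRASE_LEN, len(text) - i), 1, -1):
            chunk = text[i:i + n]
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += n
                break
        else:
            # No phrase matched here: fall back to a per-character guess,
            # leaving unknown characters unchanged.
            ch = text[i]
            out.append(CHAR_DEFAULTS.get(ch, ch))
            i += 1
    return "".join(out)


print(to_traditional("头发"))
print(to_traditional("发展"))
```

The point is that 发 comes out as 髮 in 头发 but as 發 in 发展, which a purely character-level table cannot do. A lone 发 still falls back to a guess, so phrase tables reduce the errors rather than eliminate them.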
westmeadboy Posted September 3, 2009 at 06:47 AM
Given one traditional character, is there exactly one corresponding simplified character?
imron Posted September 3, 2009 at 08:15 AM
Not necessarily. One example I can think of off the top of my head is 乾, which is pronounced both gān and qián. For the pronunciation gān, the character is simplified as 干. For the pronunciation qián, the character keeps its original form. So when a piece of Traditional Chinese text is converted to Simplified, the character 乾 will sometimes stay 乾 and sometimes become 干.
westmeadboy Posted September 3, 2009 at 08:21 AM
How about if you are given a traditional character AND its pinyin?
imron Posted September 3, 2009 at 12:17 PM
Not sure. However, the first question that springs to mind is how that pinyin was created. If it was generated automatically, it might suffer from inaccuracies of its own. I just thought of another character too: 麽, which maps to both 么 and 麽 (again depending on the pinyin).
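If the pinyin can be trusted, the lookup could be keyed on the (character, pinyin) pair instead of the character alone. Below is a minimal Python sketch of that idea. The table and the function name `to_simplified` are hypothetical. The two 乾 entries follow the example given earlier in the thread; the readings used for 麽 (me and mó) are my own assumption and should be checked against a dictionary.

```python
import unicodedata

# Hypothetical lookup keyed on (traditional character, pinyin reading).
TRAD_PINYIN_TO_SIMP = {
    ("乾", "gān"): "干",
    ("乾", "qián"): "乾",
    ("麽", "me"): "么",   # assumed reading, verify before relying on it
    ("麽", "mó"): "麽",   # assumed reading, verify before relying on it
}


def to_simplified(char, pinyin):
    """Return the simplified form for this reading, or None if unknown."""
    # Normalise so that composed and decomposed tone marks compare equal.
    key = (char, unicodedata.normalize("NFC", pinyin.strip().lower()))
    return TRAD_PINYIN_TO_SIMP.get(key)


print(to_simplified("乾", "gān"))
print(to_simplified("乾", "qián"))
print(to_simplified("乾", "gan"))  # toneless input is not in the table
```

Returning None for an unknown pair, rather than guessing, makes it obvious when the pinyin is missing or wrong, which matters if the pinyin was machine-generated in the first place.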