mph Posted September 22, 2006 at 12:44 PM

I am new to adso. I have tried using adso trans to look up a few Chinese words (漢詞) that have multiple pinyin readings and meanings. For example, according to 現代漢語詞典, 人家 has two meanings:

ren2jia5 [pronoun] other people (used to refer to people other than yourself)
ren2jia1 [noun] house or dwelling

There are hundreds of Chinese words such as 人家 that have multiple pronunciations and meanings. Some even have three pronunciations and meanings. So far, using adso, I have only been able to find a single pronunciation and meaning for each of these multi-pinyin words. How can I access the different pronunciations and meanings in adso? Hope someone can put me right.

mph
trevelyan Posted September 23, 2006 at 11:25 AM

If you want to see the various options for any single entry you can always check the backend dictionary: http://www.adsotrans.com/adso/uniedit.pl

Or do you mean an annotation mode which outputs all of the various possibilities? I could set something like that up fairly easily, but that wouldn't be much help for duoyinci, etc.
mph Posted September 23, 2006 at 02:41 PM

I have added 人家 ren2jia5 "other people" to the adso database. The database should now contain two entries for 人家. The new entry, 人家 ren2jia5, is HSK 丙. The existing entry, 人家 ren2jia1, is HSK 乙.

Looking back through my Chinese primary school textbooks, I learnt 人家 ren2jia5 in the 2nd term of primary 2. I am not sure how such a fundamental word, one of the first and most important words, is not in the adso database.

So far I have also not been able to find any other multisound Chinese words with more than one character in the adso database. For example, 當時 has two entries in 現代漢語詞典: dang1 shi2 and dang4 shi2. Both have very different meanings and are quite common. It could be that the people adding entries don't know how to add multisound multicharacter words.

I have tried to add ren2 jia5, but when I paste 人家 into the www.adso.com home page I only get the one entry, ren2jia1. What am I doing wrong?
trevelyan Posted September 24, 2006 at 05:56 AM

You're not doing anything wrong, mph. Assuming you added the second definition through the uniedit.pl page, it will take a bit of time for the system to convert it to UTF-8 and for the entry to start working with the main interface.

There are lots of duoyinci already in the database. The last one which comes to mind is 差使, although that's because it was the last one I personally added. The database is not as comprehensive as the commercial dictionary mentioned above in this area, for obvious reasons. We are working on that....

The easiest way to help the software differentiate between multiple definitions for words with the same part of speech is to familiarize yourself with the CODE field. This allows users to add heuristics on contextual usage to the software. I've taken a quick stab at making a first approximation for 人家. There is a thread about the CODE field further down this forum.

Most users do not contribute. And most contributors don't worry about adding the pinyin for new entries. This is just the nature of the beast. The important thing is that we are steadily identifying issues and correcting them, and using a review process to prevent new entries from creeping in.

The system will automatically generate pinyin for words missing it, taking advantage of what it already knows about the variant pronunciations of individual characters. This means it isn't really necessary to add the pinyin for a word explicitly unless the software generates it incorrectly.

The best way to help is always either to correct the errors you spot, or to point out any problems to others so that they can get fixed. I'm happy to do my best to fix segmentation and sense-disambiguation issues as they are brought to my attention.

Best,
--trevelyan
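A minimal sketch of the pinyin fallback described above, assuming the system keeps a table of per-character readings with the most common reading listed first. The names CHAR_READINGS, EXPLICIT_PINYIN and generate_pinyin, and the table contents, are hypothetical; Adso's actual data and code are not shown in this thread.

```python
# Hypothetical per-character reading table; the first reading is the default.
CHAR_READINGS = {
    "人": ["ren2"],
    "家": ["jia1"],
    "當": ["dang1", "dang4"],
    "時": ["shi2"],
}

# Hypothetical table of words whose pinyin a contributor entered explicitly.
EXPLICIT_PINYIN = {
    "人家": ["ren2jia1", "ren2jia5"],
}


def generate_pinyin(word, explicit=EXPLICIT_PINYIN, char_readings=CHAR_READINGS):
    """Return the known readings of a word, falling back to per-character defaults."""
    if word in explicit:
        # A contributor supplied the pinyin, so use it unchanged.
        return list(explicit[word])
    syllables = []
    for char in word:
        readings = char_readings.get(char)
        if not readings:
            raise KeyError(f"no reading known for {char!r}")
        # Fallback: take the default (first) reading of each character.
        syllables.append(readings[0])
    return ["".join(syllables)]


if __name__ == "__main__":
    print(generate_pinyin("當時"))
    print(generate_pinyin("人家"))
```

Under these assumptions the fallback produces only one reading for 當時 (the dang4 shi2 reading is lost), which is the case where adding the pinyin explicitly matters.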
mph Posted September 24, 2006 at 02:40 PM

trevelyan wrote: "There are lots of duoyinci already in the database. The last one which comes to mind is 差使, although that's because it was the last one I personally added."

The image below shows the search outcome for 差使 on the http://www.adsotrans.com/adso/uniedit.pl link you supplied. Clearly there are two database entries for 差使.

I then looked up 差使 at www.adsotrans.com and received the following image, which contains only one of the database entries.

I am still confused about why the second image does not contain both database entries for 差使. The second image is probably the one seen by most of your users. May I suggest that the second view include all database entries to aid the user.

mph
trevelyan Posted September 26, 2006 at 08:28 AM

The software analyses the grammatical context when deciding which definition / entry to suggest. It tries to pick the single best definition for each word rather than simply throwing everything at the user.

If it would be useful to you I'd be happy to add an output option that flushed all possible entries. It would get pretty messy for duoyinci or common characters though....
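To illustrate the idea of picking a single entry from context, here is a rough sketch, not Adso's actual algorithm. It assumes the text is already segmented into tokens and that each entry can carry a simple contextual hint. The Entry class, the preferred_prev field, the rule for 户 and the choice of default entry are all made up for this example; the real CODE field format is not described in this thread.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    word: str
    pinyin: str
    gloss: str
    # Hypothetical stand-in for a CODE-style heuristic: tokens that, when they
    # appear immediately before the word, favour this entry.
    preferred_prev: frozenset = frozenset()


# Illustrative entries only; the first matching entry acts as the default.
ENTRIES = [
    Entry("人家", "ren2jia5", "other people"),
    Entry("人家", "ren2jia1", "house or dwelling", frozenset({"户", "戶"})),
]


def pick_entry(tokens, index, entries=ENTRIES):
    """Pick one entry for tokens[index] using the token just before it."""
    word = tokens[index]
    candidates = [e for e in entries if e.word == word]
    if not candidates:
        return None
    prev = tokens[index - 1] if index > 0 else None
    for entry in candidates:
        if prev in entry.preferred_prev:
            # A contextual hint matched, so prefer this entry.
            return entry
    # No hint matched: fall back to the default entry.
    return candidates[0]


if __name__ == "__main__":
    print(pick_entry(["一", "户", "人家"], 2).pinyin)
    print(pick_entry(["人家", "说"], 0).pinyin)
```

A single-word query has no neighbouring tokens, so a scheme like this always returns the default entry, which may be why a lookup of one word shows only one reading.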
zhwj Posted October 17, 2006 at 09:58 AM

It might be enough to simply output all entries automatically when the source text corresponds to a single entry in the database - in other words, when the user wants to use Adso as a dictionary rather than a gloss engine. I think this is what mph is looking for (note his use of "look up" rather than "annotate"). I'd imagine that a significant number of people coming to Adso for the first time test it out with single words rather than running text, and the results might appear to them to be inferior to those of other annotation engines out there....
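The suggestion above could be sketched roughly as follows: if the whole input matches one headword, return every entry for it, otherwise hand the text to the normal gloss path. The function lookup_or_gloss, the DICTIONARY table and the gloss callback are hypothetical names for illustration, not part of Adso.

```python
# Hypothetical headword table: word -> list of (pinyin, gloss) entries.
DICTIONARY = {
    "人家": [
        ("ren2jia1", "house or dwelling"),
        ("ren2jia5", "other people"),
    ],
}


def lookup_or_gloss(text, dictionary, gloss):
    """Return all entries for a single headword, else defer to the gloss engine.

    `gloss` is any callable that annotates running text; it stands in for the
    existing context-sensitive path and is not implemented here.
    """
    query = text.strip()
    if query in dictionary:
        # The input is exactly one headword: behave like a dictionary.
        return {"mode": "dictionary", "entries": list(dictionary[query])}
    # Anything else is treated as running text and glossed as before.
    return {"mode": "gloss", "entries": gloss(query)}


if __name__ == "__main__":
    # A placeholder gloss function that annotates nothing.
    print(lookup_or_gloss("人家", DICTIONARY, lambda t: []))
```

This keeps the single-best-definition behaviour for running text and only lists every reading when the query is a bare word.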