trevelyan Posted March 24, 2007 at 05:58 AM Looking for thoughts/preferences on the usefulness of keeping entries in the Adso database that do not have English definitions. The reason is that we've recently had an upsurge in submissions from people providing Chinese words without English definitions. In the past I've tended to accept undefined proper nouns (names, etc.) when reviewing submissions, but have generally tried to keep undefined compound words out, or added a definition for them myself when processing user additions. The growing number of undefined submissions is making this infeasible, though. As I see it, having these words in the system is definitely useful for better segmentation. The downside is that it reduces the usability of the machine translation output and leaves a system that won't provide English explanations for some words. I don't want to have to go through the work of creating a specialized (purged) database for the machine translation and annotation step (although I suppose that is a possibility). I also tend to think that having well-defined entries helps keep quality up, since other people can edit an entry and understand why it was contributed. But there's definitely a trade-off here. Not sure what I should do, so thoughts are welcome.
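To illustrate the segmentation point above: a minimal sketch using greedy forward maximum matching, which is a generic textbook technique and not necessarily how Adso segments text. The `segment` function, the `max_len` parameter, and the tiny lexicons are all hypothetical; the point is only that a word can improve segmentation even when it carries no English definition.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching: take the longest known word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Single characters are always accepted so the loop makes progress.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical lexicons: the second one adds the compound as a bare,
# undefined entry (only the word itself matters for segmentation).
without_entry = {"打", "杀"}
with_entry = without_entry | {"打打杀杀"}

print(segment("打打杀杀", without_entry))
print(segment("打打杀杀", with_entry))
```

With the compound missing, the text falls apart into four single characters; with it present, it comes out as one token, definition or not.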
onebir Posted March 24, 2007 at 12:46 PM Perhaps if you presented the untranslated words somewhere, people (other than the people who submitted them) would translate them for you?
character Posted March 24, 2007 at 12:50 PM It might be more work than it's worth, but have you considered changing the programs that read from the database to handle blank entries differently? Perhaps they could look up the definitions of the individual characters and display those instead. Also, I'm not familiar with your submission process, but do you have the option of accepting a Chinese definition for an entry? Having a Chinese definition could make it easier to come up with an English one, or you could try to produce a machine translation of the definition. Of course, as noted, posting the list of blank entries could attract people to submit definitions.
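The per-character fallback suggested above could look roughly like this. It is only a sketch under stated assumptions: the `ENTRIES` dict stands in for the real database, an empty string marks an entry with no English definition, and the `gloss` function and its bracketed output format are made up for illustration.

```python
# Hypothetical in-memory stand-in for dictionary rows: word -> English definition.
# An empty string marks an entry that was submitted without a definition.
ENTRIES = {
    "打": "to hit",
    "杀": "to kill",
    "打打杀杀": "",
}


def gloss(word, entries):
    """Return the word's own definition, or a per-character fallback if it is blank."""
    definition = entries.get(word, "")
    if definition:
        return definition
    # Fall back to the definitions of the individual characters;
    # "?" marks a character that has no definition either.
    parts = [entries.get(ch, "") or "?" for ch in word]
    return "[" + " + ".join(parts) + "]"


print(gloss("打打杀杀", ENTRIES))
```

The brackets are there so a reader (or a downstream program) can tell a stitched-together gloss from a real definition.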
trevelyan Posted March 25, 2007 at 07:28 PM @onebir -- excellent suggestion. @character -- the database is used by a number of applications, so changes in editorial style might affect third-party applications. I will simply follow onebir's suggestion in the short term. At least that way people will know why their submissions are getting filtered out.
roddy Posted March 25, 2007 at 11:51 PM Putting them online so people can see them and hopefully deal with them is, I think, the best idea (which I see you've now done - 打打杀杀 is my new favorite word). If you were able to set up an automatically generated page that pulls out all the undefined words and lists them with a submission form field next to each one, that would be particularly helpful. People could then just scan through it, fill in the ones they know, then submit. 'Course, there's a bit of work involved in doing that, and you may want to keep the barriers to submission a little high to ensure that people are actually considering what they are submitting. Also, I recall you had an RSS feed for new words once. Not sure if that's still running, but perhaps also an RSS feed for undefined words? That way we can all learn of them in the first instance and rush online to 抢答 (race to be the first to answer).
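The "pull out all the undefined words" step for such a page could be sketched as follows. Adso's real schema isn't described in this thread, so the `entries` table, its `chinese` and `english` columns, and the `undefined_words` helper are all hypothetical; an in-memory SQLite database stands in for the real one.

```python
import sqlite3


def undefined_words(conn):
    """Return the Chinese words whose English definition is NULL or blank."""
    rows = conn.execute(
        "SELECT chinese FROM entries "
        "WHERE english IS NULL OR TRIM(english) = '' "
        "ORDER BY id"
    )
    return [row[0] for row in rows]


# Hypothetical sample data in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, chinese TEXT, english TEXT)"
)
conn.executemany(
    "INSERT INTO entries (chinese, english) VALUES (?, ?)",
    [("打", "to hit"), ("打打杀杀", ""), ("杀", None)],
)

print(undefined_words(conn))
```

The same list could feed either the page with form fields or an RSS feed; only the rendering differs.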
novemberfog Posted March 26, 2007 at 01:07 PM Totally off-topic (sorry!), but if anyone knows the actual meaning of 打打杀杀, I would love to know. Just looking at it I can see it is "hit-hit-kill-kill", but I'd love to know whether this is just something from a comic book or an actual word.
roddy Posted March 26, 2007 at 01:11 PM Judging from a quick scan of Google's page 1 results, it seems to be something along the lines of 'gratuitous violence', particularly in computer games.
trevelyan Posted March 27, 2007 at 04:33 PM You insensitive bastards.... that's my mother's maiden name!