valikor Posted April 16, 2010 at 12:46 PM I am interested in a character list (based on frequency, HSK level, something like that--it doesn't really matter) with a good list of relevant words for each character. The zein.se site has approximately what I want, but its list of words for each character is not very good: it usually has 3-4 at most, often lacking very important and common words. Even a recommendation for a reliable site for searching "words containing a character" (reliable in terms of showing the most common words, and with good English translations) would help; I wouldn't be opposed to looking up each character individually and compiling the list myself. This might be good review! The same (or a similar) question was asked here http://www.chinese-forums.com/index.php?/topic/2-favourite-chinese-musician0343 a long time ago, but the answers don't quite get at what I'm looking for, I think. Any help is appreciated, thanks! David
Guest realmayo Posted April 16, 2010 at 01:51 PM Did you try the first link in the first comment of the post you linked to? It goes here: http://www.chinese-forums.com/vocabulary. If you type a character, put a * on either side of it (I forgot to add the *s the first time round), select phrase length "all", HSK level "all", output format "list" ... then you might get what you're looking for. I just did a test with 分 (i.e. *分*) and got 51 results back, which should be all the words that contain 分 in the HSK vocab lists. Not sure if this provides what you're after? EDIT: rereading your post, it looks like this only provides a partial solution (the second part) to your question.
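For anyone who would rather run the same kind of `*分*` wildcard search locally, here is a minimal Python sketch. It is not how the forum's vocabulary tool works internally; `SAMPLE_WORDS` and `wildcard_search` are made-up names, and the word list is a tiny hypothetical stand-in for the real HSK lists.

```python
from fnmatch import fnmatchcase

# Hypothetical sample; a real run would load the full HSK word lists instead.
SAMPLE_WORDS = ["分钟", "部分", "十分", "充分", "学习", "百分之"]

def wildcard_search(pattern, words):
    """Return the words matching a shell-style wildcard pattern such as *分*."""
    return [w for w in words if fnmatchcase(w, pattern)]

# A * on either side matches the character anywhere in the word.
matches = wildcard_search("*分*", SAMPLE_WORDS)
print(len(matches), matches)
```

Dropping the leading `*` (pattern `分*`) would restrict the results to words that start with the character.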
renzhe Posted April 16, 2010 at 03:03 PM http://hmarty.free.fr/hanzi/ It has lists of characters sorted by frequency and HSK level, and for each character, it has a list of variants and common compounds. It also includes a radical index. It's based on an old version of CEDICT and is not actively maintained, but it is still very useful for character (字 as opposed to 词) research.
c_redman Posted April 16, 2010 at 05:35 PM There was an attachment (flash.txt) in the thread you linked to, which seems to fit what you were asking for. What do you need that is different from that list? Is it the ordering by character and word frequency that's missing? It would be easy enough to generate anything using the most recent CC-CEDICT, as long as the format is spelled out.
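As a rough illustration of generating such a list from CC-CEDICT, the Python sketch below groups entries by the simplified characters they contain. It assumes the usual CC-CEDICT line layout (`Traditional Simplified [pin1 yin1] /gloss/gloss/`); the `SAMPLE` lines are invented stand-ins written in that layout rather than verbatim dictionary entries, and `group_by_character` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# CC-CEDICT entry layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
ENTRY = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/\s*$")

def group_by_character(lines):
    """Map each simplified character to the entries whose word contains it."""
    index = defaultdict(list)
    for line in lines:
        if line.startswith("#"):      # comment/header lines in the file
            continue
        m = ENTRY.match(line)
        if not m:                     # skip anything that is not an entry
            continue
        _trad, simp, pinyin, glosses = m.groups()
        entry = (simp, pinyin, glosses.split("/"))
        for ch in dict.fromkeys(simp):  # each distinct character once, in order
            index[ch].append(entry)
    return index

# Invented sample lines in CC-CEDICT layout (not copied from the dictionary).
SAMPLE = [
    "# sample header comment",
    "分鐘 分钟 [fen1 zhong1] /minute/",
    "部分 部分 [bu4 fen5] /part/share/section/",
    "學習 学习 [xue2 xi2] /to learn/to study/",
]

index = group_by_character(SAMPLE)
for simp, pinyin, glosses in index["分"]:
    print(simp, pinyin, "; ".join(glosses))
```

Ordering by character or word frequency would need a separate frequency list, which CC-CEDICT itself does not supply.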
xiaoxiaocao Posted April 17, 2010 at 04:03 AM I'm having some trouble getting the .txt file from the other thread to work in Pleco. I'm not much of a techie, I'm afraid, so if anyone could offer any suggestions it would be much appreciated. flash.txt
roddy Posted April 17, 2010 at 04:19 AM Spend half a day playing with databases and scripting languages (MySQL and PHP are what I use) and you can have local copies of Unihan, the HSK lists, CEDICT, Adso, etc., spitting out lists like this any time you want them.
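To give a feel for the database approach, here is a small sketch that swaps in Python's built-in `sqlite3` module for the MySQL and PHP setup mentioned above. The table layout, the `words_with` helper, and the HSK levels in `ROWS` are all assumptions made up for the example, not real data.

```python
import sqlite3

# Hypothetical rows: (word, HSK level). The levels are invented for illustration.
ROWS = [("分钟", 1), ("部分", 2), ("充分", 3), ("学习", 1)]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE words (word TEXT, hsk_level INTEGER)")
conn.executemany("INSERT INTO words VALUES (?, ?)", ROWS)

def words_with(conn, ch):
    """Return (word, hsk_level) rows whose word contains ch, lowest level first."""
    cur = conn.execute(
        "SELECT word, hsk_level FROM words WHERE word LIKE ? "
        "ORDER BY hsk_level, word",
        (f"%{ch}%",),
    )
    return cur.fetchall()

result = words_with(conn, "分")
print(result)
```

Once the real lists are loaded into tables like this, a different ordering or filter is just a different query.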
c_redman Posted April 19, 2010 at 12:55 PM Roddy, your original flash.txt is in GB encoding. If you put up a utf-8 version, it may be better for importing into various tools.
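Re-encoding a GB file as UTF-8 takes only a few lines of Python. The sketch below assumes the file decodes with Python's `gb18030` codec (which is designed to also cover GB2312 and GBK text); the file names and the `convert_to_utf8` helper are made up, and the demo runs on a throwaway file rather than the actual attachment.

```python
import tempfile
from pathlib import Path

def convert_to_utf8(src, dst, source_encoding="gb18030"):
    """Read src as GB-encoded text and write the same text to dst as UTF-8."""
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")

# Demo on a temporary file so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "flash_gb.txt"      # hypothetical file names
    dst = Path(tmp) / "flash_utf8.txt"
    src.write_bytes("分钟 minute".encode("gb2312"))
    convert_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted.decode("utf-8"))
```

If the source file turns out to use a different encoding, decoding will either raise an error or produce garbage, so it is worth spot-checking a few lines of the output.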
valikor (Author) Posted April 22, 2010 at 03:33 AM Just wanted to thank everyone for their helpful replies. You've all pointed me to more or less what I was looking for. Thanks!
xiaoxiaocao Posted April 22, 2010 at 04:16 AM Here's the list in Word format in case anyone needs it. Thanks to all. flash (2).doc
mihobu Posted April 26, 2010 at 10:20 AM I've been working on something like this that sorts words by character and word frequency, and introduces new words only after all the component characters have appeared. It's available for download at http://monkeywalk.com/wordlist. Michael
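One possible reading of that ordering rule, sketched in Python: walk the characters in frequency order and release each word as soon as every character in it has been introduced. This is only a guess at the idea, not the code behind the download above; `order_words` and both sample rankings are invented for illustration.

```python
def order_words(chars_by_freq, words_by_freq):
    """Pair each character with the words that become available once it is known.

    Both inputs are assumed to be sorted most-frequent first. Returns the
    schedule plus any words whose characters never all appeared.
    """
    seen = set()
    pending = list(words_by_freq)
    schedule = []
    for ch in chars_by_freq:
        seen.add(ch)
        # A word is ready once all of its characters have been introduced.
        ready = [w for w in pending if set(w) <= seen]
        pending = [w for w in pending if not set(w) <= seen]
        schedule.append((ch, ready))
    return schedule, pending

# Hypothetical rankings, not real frequency data.
CHARS = ["一", "分", "钟", "部"]
WORDS = ["分钟", "部分", "一部分", "学习"]

schedule, leftover = order_words(CHARS, WORDS)
for ch, words in schedule:
    print(ch, words)
print("not yet covered:", leftover)
```

Within each step the words keep their word-frequency order, so the more common words still come first.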