Chinese-Forums

Character list -- words containing each character



Posted

I am interested in a character list (based on frequency, HSK level, something like that--doesn't really matter) with a good list of relevant words for each character.

The zein.se site has approximately what I want, but the list of words for each character is not very good... it usually has 3-4 at most, often lacking very important and common words.

Even if someone just has a recommendation for a reliable site for searching "words containing a character" (reliable in terms of showing the most common words, and with good English translations), I wouldn't be opposed to looking up each character individually and compiling the list myself. This might be good review!

The same (or similar) question was asked here http://www.chinese-forums.com/index.php?/topic/2-favourite-chinese-musician0343 a long time ago, but the answers don't quite get at what I'm looking for, I think.

Any help is appreciated, thanks!

David

Posted

Did you try the first link in the first comment of the post you linked to? It goes here: http://www.chinese-forums.com/vocabulary. Type a character with a * on either side of it (I forgot to add the *s the first time round), select phrase length "all", HSK level "all" and output format "list", and you might get what you're looking for. I just did a test with 分 (i.e. *分*) and got 51 results back, which should be all the words containing 分 in the HSK vocab lists.

Not sure if this provides what you're after?

EDIT: rereading your post, it looks like this only provides a partial solution (the second part) to your question.

Posted

http://hmarty.free.fr/hanzi/

It has lists of characters sorted by frequency and HSK level, and for each character, it has a list of variants and common compounds. It also includes a radical index.

It's based on an old version of CEDICT and is not actively maintained, but it is still very useful for character (字 as opposed to 词) research.

Posted

There was an attachment (flash.txt) in the thread you linked to, which seems to fit what you were asking for. What do you need that is different from that list? Is it the ordering by character and word frequency that's missing? It would be easy enough to generate anything using the most recent CC-CEDICT, as long as the format is spelled out.
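
For anyone wondering what "generating it from CC-CEDICT" might look like, here is a rough sketch in Python. It assumes the usual CC-CEDICT line layout (traditional, simplified, pinyin in square brackets, then glosses between slashes). The function names and the sample lines are made up for illustration, and it does no frequency ordering, which would need a separate frequency list.

```python
import re
from collections import defaultdict

# Assumed CC-CEDICT entry layout:
#   TRADITIONAL SIMPLIFIED [pin1 yin1] /gloss 1/gloss 2/
ENTRY = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/\s*$")


def parse_line(line):
    """Return (simplified, pinyin, glosses) or None for comments and junk."""
    if line.startswith("#"):
        return None
    match = ENTRY.match(line)
    if match is None:
        return None
    _trad, simp, pinyin, glosses = match.groups()
    return simp, pinyin, glosses.split("/")


def words_by_character(lines):
    """Map each character to the multi-character words that contain it."""
    index = defaultdict(list)
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        simp, pinyin, glosses = entry
        if len(simp) < 2:
            continue  # skip single-character entries
        # dict.fromkeys drops repeated characters but keeps their order
        for ch in dict.fromkeys(simp):
            index[ch].append((simp, pinyin, glosses[0]))
    return index


# Hypothetical sample lines in the assumed format, not real dictionary data.
sample = [
    "# comment lines start with a hash",
    "分 分 [fen1] /to divide/minute/",
    "分鐘 分钟 [fen1 zhong1] /minute/",
    "部分 部分 [bu4 fen5] /part/share/section/",
    "鐘頭 钟头 [zhong1 tou2] /hour/",
]

index = words_by_character(sample)
for word, pinyin, gloss in index["分"]:
    print(word, pinyin, gloss)
```

To run it on the real dictionary, you would pass in the lines of the downloaded file (opened as UTF-8) instead of `sample`.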

Posted

I'm having some trouble getting the .txt file from the other thread to work on Pleco. I'm not much of a techie, I'm afraid, so if anyone could offer any suggestions it would be much appreciated.

flash.txt

Posted

Spend half a day playing with databases and scripting languages (MySQL and PHP is what I use) and you can have local copies of Unihan, the HSK lists, CEDICT, Adso, etc., spitting out lists like this any time you want them.
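
As a rough idea of how little is involved, here is a sketch using Python's built-in sqlite3 module rather than MySQL and PHP, simply because it needs no server. The table layout, the `words_containing` helper and the sample rows (including the HSK levels) are made up for illustration. A real setup would load the actual word lists.

```python
import sqlite3

# In-memory database; a real setup would use a file and import full word lists.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE words (word TEXT, pinyin TEXT, gloss TEXT, hsk_level INTEGER)"
)

# Illustrative rows only; the levels are placeholders, not checked data.
rows = [
    ("分钟", "fen1 zhong1", "minute", 1),
    ("部分", "bu4 fen5", "part", 4),
    ("十分", "shi2 fen1", "very", 4),
    ("朋友", "peng2 you5", "friend", 1),
]
conn.executemany("INSERT INTO words VALUES (?, ?, ?, ?)", rows)


def words_containing(conn, char):
    """Return all rows whose word contains the given character."""
    cursor = conn.execute(
        "SELECT word, pinyin, gloss, hsk_level FROM words "
        "WHERE word LIKE '%' || ? || '%' "
        "ORDER BY hsk_level, word",
        (char,),
    )
    return cursor.fetchall()


for word, pinyin, gloss, level in words_containing(conn, "分"):
    print(level, word, pinyin, gloss)
```

The `LIKE '%' || ? || '%'` pattern does the same job as typing *分* into a wildcard search.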

Posted

Roddy, your original flash.txt is in GB encoding. If you put up a UTF-8 version, it may be easier to import into various tools.
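
If anyone wants to convert it themselves, something like the following Python sketch should do it. It assumes the file really is in a GB-family encoding and decodes it as gb18030, which covers GB2312 and GBK as well. The function name, the file names and the tab-separated sample line are made up for illustration.

```python
import tempfile
from pathlib import Path


def gb_to_utf8(src, dst, source_encoding="gb18030"):
    """Read a GB-encoded text file and write it back out as UTF-8."""
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
    return text


# Demo on a throwaway file; the content is a made-up sample line.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "flash_gb.txt"
    dst = Path(tmp) / "flash_utf8.txt"
    src.write_bytes("分钟\tminute\n".encode("gb2312"))
    gb_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted.decode("utf-8"))
```

For the real file you would call `gb_to_utf8("flash.txt", "flash_utf8.txt")` and import the second file. If the decode step raises an error, the file is probably not GB-encoded after all.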

Posted

Just wanted to thank everyone for their helpful replies. You've all pointed me to more or less what I was looking for. Thanks!
