Chinese-Forums

Posted

Does anyone know of any lists of word frequency for Chinese? The kind of thing that tells you that 的 is the most common Chinese word, 今天 is the 97th, and 麟 is the 1054th?

Note, I'm not looking for character frequency, but word frequency.

I guess an HSK vocab list would be a step in the right direction - presumably the words at the lower levels are more frequent.

Roddy

Posted

Sorry, I don't know of a word frequency list - but do you mean that you already know of a character frequency list (on the web preferably)? If so, I'd be interested to know where.

As for the HSK, I guess you are right. But they also seem to be quite selective about the vocab they use. I don't have the list to hand, but I remember that dentist wasn't on there (although doctor was), for example.

Posted

No, I have never seen one. I think you would be lucky to find that. It is too complex to put together and would be based too much on people's own opinions and language styles rather than researchable evidence.

As you say, there are many character frequency lists.

As long as you can master 90% of the contents of an HSK dictionary then you will be fine.

Posted

JoH, go to Zhongwen.com and click on Character frequency under Vocabulary - it's the best I know of.

I got some clues here - but it looks like most of the stuff is for characters only and the HSK word lists are still the best bet (unless you want to do something daft like actually pay for a book).

Roddy

  • 6 years later...
Posted

Wenlin has one. 的 is #1, 一起 is in the middle and 野 is at the end.

  • 1 month later...
Posted

Wow, the corpora search engine in that first link is pretty cool. That will certainly come in handy some day if I want to search for real-life sentence examples without stuffing around on Google. Cheers!

Posted
tooironic: Wow, the corpora search engine in that first link is pretty cool.

I got the English examples to work. I tried some pinyin, which worked. Is it possible to use Chinese characters in a corpus search for anything other than simple words or characters? If so, how about an example? I always wished Google would implement something like this.

xiele,

Jim

Posted

What do you mean? You can search by hanzi. E.g.

Posted

Maybe the issue is that you need to put word breaks in the search terms. Unfortunately, it doesn't find a match if the words are not segmented the same way as the corpus. The corpus is segmented programmatically, so it may not be 100% perfect, either.
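To make the segmentation point concrete, here is a rough Python sketch. It is not how the corpus site actually works; `contains_phrase` and the sample sentence are made up for illustration, and the corpus is assumed to be stored as space-separated words. It only shows why a query has to be split the same way as the corpus before it can match.

```python
def contains_phrase(corpus_tokens, query_tokens):
    """Return True if query_tokens occurs as a contiguous run in corpus_tokens."""
    n = len(query_tokens)
    return any(
        corpus_tokens[i:i + n] == query_tokens
        for i in range(len(corpus_tokens) - n + 1)
    )

# Hypothetical corpus line, already segmented into words.
corpus = "今天 天气 很 好".split()

same_segmentation = contains_phrase(corpus, ["今天", "天气"])   # split like the corpus
unsegmented_query = contains_phrase(corpus, ["今天天气"])       # typed without a word break
char_by_char = contains_phrase(corpus, ["今", "天"])            # split too finely
```

Only the first query matches; the other two contain the same characters but do not line up with the corpus's word boundaries.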

Posted

Hmm. I guess my question would be what possible pedagogical uses do these corpora have for the average student?

Posted
tooironic: What do you mean? You can search by hanzi. E.g.

For example, I can't figure out the corpora syntax for conjunctions like 不但.*而且

xiele,

Jim

Posted

Yes, the syntax is a little quirky; much of the example syntax doesn't work. But I managed to get something out this way:

不但 . . . . . . . . . . . . 而且

That will match with a gap of up to 12 words.
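If you have your own segmented text, the same kind of "A, then up to N words, then B" search can be sketched in a few lines of Python. This is only an illustration of the idea, not the corpus site's query syntax; `find_gapped`, its `max_gap` parameter, and the sample sentence are my own assumptions.

```python
def find_gapped(tokens, first, second, max_gap=12):
    """Return (i, j) index pairs where tokens[i] == first, tokens[j] == second,
    and at most max_gap tokens lie between them (nearest match only)."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        # Look ahead at most max_gap + 1 positions past `first`.
        for j in range(i + 1, min(i + max_gap + 2, len(tokens))):
            if tokens[j] == second:
                hits.append((i, j))
                break
    return hits

# Hypothetical sentence, already segmented with spaces.
tokens = "他 不但 会 说 中文 ， 而且 会 写 汉字".split()

default_gap = find_gapped(tokens, "不但", "而且")             # four tokens in between
too_narrow = find_gapped(tokens, "不但", "而且", max_gap=3)   # gap limit too small
```

Here 不但 and 而且 are four tokens apart (punctuation counts as a token), so the default limit of 12 finds the pair and a limit of 3 does not.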
