Chinese-Forums
How many words does an average native speaker know?


Posted

I only cited estimates (based on dictionary sampling) provided by some renowned linguists, as referenced by another renowned linguist.

Any such study will begin by warning you about the problem of defining what a "word" is, and how you obtain the number. I believe that most of them ignored the declensions and conjugations of a word, but included all the different words from a family. Therefore, they are not counting lemmata, they are counting words. This can easily account for the factor-of-three difference.
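To make the counting problem concrete, here is a minimal sketch of how the same vocabulary gives different totals depending on the unit you count. The word list and its grouping are hand-made toy data (an assumption for illustration, not taken from any of the cited studies), and `count_units` is a hypothetical helper name.

```python
# Toy data, hand-assigned for illustration only.
# Each surface form maps to (dictionary headword, word-family root).
FORMS = {
    "sharp": ("sharp", "sharp"),
    "sharper": ("sharp", "sharp"),
    "sharpest": ("sharp", "sharp"),
    "sharpen": ("sharpen", "sharp"),
    "sharpens": ("sharpen", "sharp"),
    "sharpened": ("sharpen", "sharp"),
    "sharpening": ("sharpen", "sharp"),
    "sharpener": ("sharpener", "sharp"),
    "sharpeners": ("sharpener", "sharp"),
}


def count_units(forms):
    """Return (surface forms, headwords, word families) for a form table."""
    surface = len(forms)
    headwords = len({headword for headword, _ in forms.values()})
    families = len({family for _, family in forms.values()})
    return surface, headwords, families


surface, headwords, families = count_units(FORMS)
print(f"surface forms: {surface}, headwords: {headwords}, families: {families}")
```

On this toy table the three counts differ by a factor of three at each step. Real ratios depend entirely on the word list and on how the groups are defined, so the numbers here only show why two studies can disagree without either being wrong.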

A good check whether something is a word is to see if it has a separate heading in a dictionary. sharp, sharpen, sharpener, etc, are all different words with different meanings. Luckily for us, they are pretty easy to infer if you know the root and the suffixes. Any sufficiently educated native speaker will know well over 10,000 of those. A sufficiently educated Chinese person will have a passive knowledge of 4-5,000 characters alone, and that doesn't even start with the multi-character words which are the majority.
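The dictionary-sampling estimates mentioned earlier use exactly this headword criterion. A rough sketch of the idea, assuming you can list the headwords on each page and mark which ones you know: pick some pages at random, count the known headwords, and scale up to the whole dictionary. The function name, the fake dictionary and the known-word set below are all made up for illustration.

```python
import random


def estimate_vocabulary(pages, known, sample_size, seed=0):
    """Estimate how many headwords a person knows from a random page sample.

    pages: list of pages, each a list of headwords (hypothetical input format)
    known: set of headwords the person recognises
    """
    rng = random.Random(seed)
    sampled = rng.sample(range(len(pages)), sample_size)
    known_in_sample = sum(1 for i in sampled for word in pages[i] if word in known)
    # Average known headwords per sampled page, scaled to the full dictionary.
    return known_in_sample / sample_size * len(pages)


# Fake 100-page dictionary with 10 headwords per page; the "speaker" knows
# the first 4 headwords on every page, so the true total is 400.
pages = [[f"w{p}_{i}" for i in range(10)] for p in range(100)]
known = {f"w{p}_{i}" for p in range(100) for i in range(4)}

print(estimate_vocabulary(pages, known, sample_size=20))
```

With a real dictionary the estimate varies from sample to sample, and it inherits every decision the dictionary makes about what gets its own heading.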

Based on counting, I should have a (passive) vocabulary of over 10,000 words / vocabulary items in Chinese, and I assure you that this is nowhere NEAR enough to understand just about everything on just about anything. It's not even close. I have collected well over a thousand non-trivial words by watching Fendou alone.

Posted
Memorizing the dictionary is almost just as useless, you know.

Actually it is the fastest and the most accurate way of learning a language but it needs P...A...T...I...E...N...C...E :unsure: :wacko: :clap

Posted
However, is any of this relevant to Chinese? Is there any reason to think that "word families" exist in Chinese and, if they do, that they correspond usefully with the amount and scope of word families in English? It seems to me that in Chinese you either describe these "families" very loosely, eg anything with 友 in it belongs to the same family (which isn't so helpful because knowing 友 won't help you much with 朋 or 谊); or very tightly, limited to I guess certain suffixes such as 们、性、化 etc (朋友们、多样性、城市化).

In Chinese every word is a lemma. So the number of lemmas an educated English speaker knows is probably comparable to the number of lemmas someone knows in Chinese.

This works quite well with languages such as Spanish and English, but I have a suspicion that it overestimates the number of Chinese words you would have to know. Because Chinese uses roots to form new "words", I think it cuts down on the actual number of words you need to know. English barely does this. For example, if you know extra and you know ordinary, you can probably guess the meaning of extraordinary. So, there are many cases in English where you would need to know 3 words, but Chinese only requires knowing 2 and the 3rd could be deduced with good accuracy.

Posted
English barely does this.

I wouldn't be so sure.

I have opened my Collins Compact at a random page, to find these words:

firearm

fireball

firebomb

firebrand

firebreak

firebrick

firecracker

firedamp

firedogs

fire-eater

fire-extinguisher

firefighter

firefly

fireguard

fireman

fireplace

fireplug

firepower

fireside

firetrap

firewall

firewater

firework

fireworks

And that's just starting with fire. Then there are compound vocabulary items which have a clearly defined meaning, but are written as separate words:

fire alarm

fire brigade

fire clay

fire door

fire drill

fire engine

fire escape

fire hall

fire hydrant

fire irons

fire raiser

fire ship

These would typically be seen as two words in English, but one 词 in Chinese.

So, there are many cases in English where you would need to know 3 words, but Chinese only requires knowing 2 and the 3rd could be deduced with good accuracy.

In my experience, this accuracy is very overrated. You need a lot of context, your best guess is often wrong, and you typically need to see it written, which introduces a lot of extra information, in terms of radicals, etc.

Best of all, you don't even know if it's a verb, a noun, an adjective or something else.

Posted
In Chinese every word is a lemma

Really? In that case I would have thought Chinese has far more "lemmas", for instance you'd have to include 吃饭, 吃苦,吃亏,吃惊,吃力,好吃,省吃俭用,吃醋,吃素 and this is just copying the top few from the Wenlin dictionary. And also my earlier examples 城市化 etc.

Posted

OK, I will complete my list in 5 to 7 years, depending on the quantity of the words (if I'm still alive :rolleyes: ). It's gonna be almost all the common words a Chinese person knows plus some chengyu, literary and medical words (just some cuz I don't want lots of medical words on this list), and then we will have an idea about the number.

Posted
Really? In that case I would have thought Chinese has far more "lemmas", for instance you'd have to include 吃饭, 吃苦,吃亏,吃惊,吃力,好吃,省吃俭用,吃醋,吃素 and this is just copying the top few from the Wenlin dictionary. And also my earlier examples 城市化 etc.

Not necessarily, because the difference between a lemma and one of its word forms is not a change of lexical meaning; the word form only adds grammatical information. For instance, Spanish has nearly 50 forms of every verb. So, a native may know the verb correr (the lemma) and also know 50 other forms of the same word. English does this much less, so we might have 3 forms of the same verb. However, to accomplish the same thing Spanish does with 100,000+ different word forms, we use only a handful of auxiliary words.

I am not a master at Chinese. I am just preparing the system for me to learn Chinese. I would assume, as it would be extremely inefficient otherwise, that Chinese uses a relatively small number of auxiliary words to accomplish the same thing that English does using conjugations. For example, Chinese probably has a few words to indicate tense (past, present, and future) or mood (imperative, subjunctive, or indicative), etc.

I doubt that Chinese could have far more lemmas because unlike "word forms," which are based on a logical system, lemmas require far more memory and there would be diminishing returns for every lemma learned. So, a language can only have so many lemmas to be at a proficient level before the language ability of the dumbest few % would be pushed to the max. That is my opinion.

Posted

oh, so in English "urban" and "urbanisation" are two lemmas? I think I understand.

on a side-note, I wonder if it's actually at all useful to apply the concept of "lemmas" to Chinese.

Posted
In my experience, this accuracy is very overrated. You need a lot of context, your best guess is often wrong, and you typically need to see it written, which introduces a lot of extra information, in terms of radicals, etc.

Best of all, you don't even know if it's a verb, a noun, an adjective or something else.

Based on my limited analysis, I think a large percentage of Chinese compounds are not obvious in the first few thousand words, but as the compounds become less frequent they become more obvious.

For example, 年产 (9765), 有望 (9766), 暂停 (9767), 条文 (9768), 无情 (9769). The number next to each word is its frequency rank. I think these words would not be too difficult to figure out based on context, and the part of speech should be at least hinted at by the position in the sentence.

More importantly, there are a myriad of words like these that may not be close enough to the components to provide an exact definition, but ARE close enough to provide a mnemonic device for memory, which means the effort to remember them is far easier just like the effort to remember football is far easier than pusillanimous. That is my experience at least. I guess everyone's brain works differently.

  • 1 year later...
Posted

Hm.

Your side-note about lemmas being useful in application to Chinese... likely not for most purposes, since Chinese already has so little morphology.

However, the comparison of # of lemmas in Chinese to # of lemmas in English is, like rezaf said, likely the most accurate way to compare English and Chinese "word" knowledge.

Posted
Well my objective is to get as close as possible to a -normal- educated native speaker in discussing serious topics like politics and philosophy

I'm curious as to why this would have to be a priority for you. Let's be honest... politics in Chinese can be bloody boring and philosophy - beyond simple catchphrases - is rarely mentioned. Most people in China are just trying to 過日子 [get by] and don't care too much for those kinds of subjects (possibly due to the dubious way they are taught in their education system, but that's another topic). IMO you'd be better off mastering how to communicate - *in-depth* - about topics of everyday interest for most Chinese - getting a job, making money, getting married, living comfortably, taking care of one's family, etc. Sure, these topics look "simple" but the degree to which you can converse in them depends a lot on your level and cultural understanding. I've always found Chinese history, philosophy and culture fascinating but in my experience you're better off focusing on learning advanced concepts and vocab that come up naturally in conversation with native speakers, rather than reading and memorising the content of a myriad of different books.

Posted

First I should point out this thread is pretty old so it's possible rezaf's earlier targets may have changed.

But: being able to speak in depth about topics that interest you in a foreign language is pretty cool. If those interests are driving you forward then, even if more "useful" language knowledge about the latest TV show or describing weather or traffic or house prices is lagging behind, it'll probably catch up over time.

Of course, if you're not especially interested in politics/philosophy to begin with, then I'd agree that being able to chat in depth about the usual boring stuff is better.

There's a wider point I guess about whether part of learning a new language requires you to change your interests and effectively change your personality if you want to talk to a group of people from a different culture and background whose interests and topics of conversation revolve around areas which you previously thought were quite boring. Do you have to "dumb-down" or flatten your own interests so that they're likely to correspond to those of the vast majority of people you meet in taxis, shops, wherevers?

  • 4 years later...
Posted

Stumbled across this thread while looking for something else. My curiosity has gotten the best of me. @Rezaf, any news on your progress?

