renzhe Posted September 15, 2010 at 11:11 AM

I only cited estimates (based on dictionary sampling) provided by some renowned linguists, as referenced by another renowned linguist. Any such study will begin by warning you about the problem of defining what a "word" is, and how you obtain the number.

I believe that most of them ignored the declensions and conjugations of a word, but included all the different words from a family. Therefore, they are not counting lemmata, they are counting words. This can easily account for the factor-of-three difference.

A good check whether something is a word is to see if it has a separate heading in a dictionary. sharp, sharpen, sharpener, etc., are all different words with different meanings. Luckily for us, they are pretty easy to infer if you know the root and the suffixes. Any sufficiently educated native speaker will know well over 10,000 of those.

A sufficiently educated Chinese person will have a passive knowledge of 4-5,000 characters alone, and that doesn't even start with the multi-character words which are the majority.

Based on counting, I should have a (passive) vocabulary of over 10,000 words / vocabulary items in Chinese, and I assure you that this is nowhere NEAR enough to understand just about everything on just about anything. It's not even close. I have collected well over a thousand non-trivial words by watching Fendou alone.

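The "dictionary sampling" idea mentioned in the post above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the procedure the cited linguists used: the function name `estimate_vocabulary`, the made-up headword list, and the `known` set are all hypothetical.

```python
import random


def estimate_vocabulary(headwords, known, sample_size, seed=0):
    """Estimate how many headwords someone knows from a random sample.

    headwords: list of all dictionary headwords (hypothetical toy data here)
    known: set of headwords the person recognises
    """
    rng = random.Random(seed)
    sample = rng.sample(headwords, sample_size)  # sample without replacement
    hits = sum(1 for word in sample if word in known)
    # Scale the known fraction of the sample up to the whole dictionary.
    return round(hits / sample_size * len(headwords))


# Toy dictionary of 1,000 made-up headwords, of which the first 300 are "known".
headwords = [f"word{i}" for i in range(1000)]
known = set(headwords[:300])

print(estimate_vocabulary(headwords, known, sample_size=100))
```

The estimate is only as meaningful as the headword list: whether sharp, sharpen and sharpener count as one entry or three changes the total, which is the caveat the post raises.
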
rezaf (Author) Posted September 15, 2010 at 12:11 PM

"Memorizing the dictionary is almost just as useless, you know."

Actually it is the fastest and the most accurate way of learning a language but it needs P.A...T..........IE.... :unsure: ...........N.............. :wacko: .............CE

xaze Posted September 15, 2010 at 09:08 PM

"However, is any of this relevant to Chinese? Is there any reason to think that 'word families' exist in Chinese and, if they do, that they correspond usefully with the amount and scope of word families in English? It seems to me that in Chinese you either describe these 'families' very loosely, e.g. anything with 友 in it belongs to the same family (which isn't so helpful because knowing 友 won't help you much with 朋 or 谊); or very tightly, limited to I guess certain suffixes such as 们、性、化 etc. (朋友们、多样性、城市化)."

In Chinese every word is a lemma. So the number of lemmas an educated English speaker knows is probably comparable to the number of lemmas someone knows in Chinese. This kind of comparison works quite well with languages such as Spanish and English, but I have a suspicion that it overestimates the number of Chinese words you would have to know. Because Chinese uses roots to form new "words", I think it cuts down on the actual number of words you need to know. English barely does this. For example, if you know extra and you know ordinary, you can probably guess the meaning of extraordinary. So, there are many cases in English where you would need to know 3 words, but Chinese only requires knowing 2 and the 3rd could be deduced with good accuracy.

renzhe Posted September 15, 2010 at 09:43 PM

"English barely does this."

I wouldn't be so sure. I have opened my Collins Compact at a random page, to find these words:

firearm, fireball, firebomb, firebrand, firebreak, firebrick, firecracker, firedamp, firedogs, fire-eater, fire-extinguisher, firefighter, firefly, fireguard, fireman, fireplace, fireplug, firepower, fireside, firetrap, firewall, firewater, firework, fireworks

And that's just starting with fire. Then there are compound vocabulary items which have a clearly defined meaning, but are written as separate words:

fire alarm, fire brigade, fire clay, fire door, fire drill, fire engine, fire escape, fire hall, fire hydrant, fire irons, fire raiser, fire ship

These would typically be seen as two words in English, but one 词 in Chinese.

"So, there are many cases in English where you would need to know 3 words, but Chinese only requires knowing 2 and the 3rd could be deduced with good accuracy."

In my experience, this accuracy is very overrated. You need a lot of context, your best guess is often wrong, and you typically need to see it written, which introduces a lot of extra information, in terms of radicals, etc. Best of all, you don't even know if it's a verb, a noun, an adjective or something else.

Guest realmayo Posted September 16, 2010 at 12:30 AM

"In Chinese every word is a lemma"

Really? In that case I would have thought Chinese has far more "lemmas", for instance you'd have to include 吃饭、吃苦、吃亏、吃惊、吃力、好吃、省吃俭用、吃醋、吃素 and this is just copying the top few from the Wenlin dictionary. And also my earlier examples 城市化 etc.

rezaf (Author) Posted September 16, 2010 at 12:23 PM

OK, I will complete my list in 5 to 7 years, depending on the quantity of the words (if I'm still alive :rolleyes: ). It's gonna be almost all the common words a Chinese person knows plus some chengyu, literary and medical words (just some cuz I don't want lots of medical words on this list) and then we will have an idea about the number.

xaze Posted September 20, 2010 at 06:48 AM

"Really? In that case I would have thought Chinese has far more 'lemmas', for instance you'd have to include 吃饭、吃苦、吃亏、吃惊、吃力、好吃、省吃俭用、吃醋、吃素 and this is just copying the top few from the Wenlin dictionary. And also my earlier examples 城市化 etc."

Not necessarily, because the difference between a lemma and one of its word forms is not a change of lexical meaning; a word form only adds grammatical information. For instance, Spanish has nearly 50 forms of every verb. So, a native may know the verb correr (the lemma) and also know 50 other forms of the same word. English does this much less, so we might have 3 forms of the same verb. However, in order to accomplish the same thing Spanish does with 100,000+ different word forms, we use only a handful of auxiliary words.

I am not a master at Chinese. I am just preparing the system for me to learn Chinese. I would assume, as it would be extremely inefficient otherwise, that Chinese uses relatively few auxiliary words to accomplish the same thing that English does using conjugations. For example, Chinese probably has a few words to indicate tense (past, present, and future) or mood (imperative, subjunctive, or indicative), etc.

I doubt that Chinese could have far more lemmas because, unlike "word forms", which are based on a logical system, lemmas require far more memory and there would be diminishing returns for every lemma learned. So, a language can only require so many lemmas for proficiency before the language ability of the dumbest few % would be pushed to the max. That is my opinion.

Guest realmayo Posted September 20, 2010 at 06:59 AM

Oh, so in English "urban" and "urbanisation" are two lemmas? I think I understand.

On a side note, I wonder if it's actually at all useful to apply the concept of "lemmas" to Chinese.

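The distinction being argued over in the posts above (word forms vs. lemmas vs. word families) can be made concrete with a small count. This is a minimal sketch with hand-made annotations: the `TOKENS` table and the `count_units` function are hypothetical, and the family assignments are one possible judgment, not a linguistic authority.

```python
# Hypothetical hand-made annotations: (word form, lemma, word family).
TOKENS = [
    ("sharp", "sharp", "sharp"),
    ("sharper", "sharp", "sharp"),
    ("sharpen", "sharpen", "sharp"),
    ("sharpens", "sharpen", "sharp"),
    ("sharpened", "sharpen", "sharp"),
    ("sharpener", "sharpener", "sharp"),
    ("sharpeners", "sharpener", "sharp"),
    ("urban", "urban", "urban"),
    ("urbanisation", "urbanisation", "urban"),
]


def count_units(tokens):
    """Return (distinct word forms, distinct lemmas, distinct word families)."""
    forms = {form for form, _, _ in tokens}
    lemmas = {lemma for _, lemma, _ in tokens}
    families = {family for _, _, family in tokens}
    return len(forms), len(lemmas), len(families)


print(count_units(TOKENS))  # (9, 5, 2)
```

The same nine strings give three quite different "vocabulary sizes" depending on the unit counted, which is one way estimates can differ by a factor of two or three.
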
xaze Posted September 20, 2010 at 07:03 AM

"In my experience, this accuracy is very overrated. You need a lot of context, your best guess is often wrong, and you typically need to see it written, which introduces a lot of extra information, in terms of radicals, etc. Best of all, you don't even know if it's a verb, a noun, an adjective or something else."

Based on my limited analysis, I think a large percentage of Chinese compounds are not obvious in the first few thousand words, but as the compounds become less frequent they become more obvious. For example, 年产 (9765), 有望 (9766), 暂停 (9767), 条文 (9768), 无情 (9769). The number next to each word is its frequency rank. I think these words would not be too difficult to figure out based on context, and part of speech should be at least hinted at by the position in the sentence.

More importantly, there are a myriad of words like these that may not be close enough to their components to provide an exact definition, but ARE close enough to provide a mnemonic device, which means they take far less effort to remember, just as football takes far less effort to remember than pusillanimous. That is my experience at least. I guess everyone's brain works differently.

Guest realmayo Posted July 12, 2012 at 09:44 AM

rezaf, how's this project going?

陳德聰 Posted July 12, 2012 at 09:53 AM

Hm. Your side-note about lemmas being useful in application to Chinese... likely not for most purposes, since Chinese already has so little morphology.

However, the comparison of # of lemmas in Chinese to # of lemmas in English is, like rezaf said, likely the most accurate way to compare English and Chinese "word" knowledge.

tooironic Posted July 12, 2012 at 11:00 PM

"Well my objective is to get as close as possible to a -normal- educated native speaker in discussing serious topics like politics and philosophy"

I'm curious as to why this would have to be a priority for you. Let's be honest... politics in Chinese can be bloody boring and philosophy - beyond simple catchphrases - is rarely mentioned. Most people in China are just trying to 過日子 [get by] and don't care too much for those kinds of subjects (possibly due to the dubious way they are taught in their education system, but that's another topic).

IMO you'd be better off mastering how to communicate - *in-depth* - about topics of everyday interest for most Chinese - getting a job, making money, getting married, living comfortably, taking care of one's family, etc. Sure, these topics look "simple" but the degree to which you can converse in them depends a lot on your level and cultural understanding.

I've always found Chinese history, philosophy and culture fascinating but in my experience you're better off focusing on learning advanced concepts and vocab that come up naturally in conversation with native speakers, rather than reading and memorising the content of a myriad of different books.

Guest realmayo Posted July 13, 2012 at 08:31 AM

First I should point out this thread is pretty old so it's possible rezaf's earlier targets may have changed.

But: being able to speak in depth about topics that interest you in a foreign language is pretty cool. If those interests are driving you forward then, even if more "useful" language knowledge about the latest TV show or describing weather or traffic or house prices is lagging behind, it'll probably catch up over time. Of course, if you're not especially interested in politics/philosophy to begin with, then I'd agree that being able to chat in depth about the usual boring stuff is better.

There's a wider point, I guess, about whether learning a new language requires you to change your interests, and effectively your personality, if you want to talk to people from a different culture and background whose conversations revolve around areas which you previously thought were quite boring. Do you have to "dumb down" or flatten your own interests so that they're likely to correspond to those of the vast majority of people you meet in taxis, shops, wherever?

LiMo Posted July 22, 2016 at 11:03 PM

Stumbled across this thread while looking for something else. My curiosity has gotten the best of me. @Rezaf, any news on your progress?
