Ed Log Posted March 27, 2009 at 02:57 PM
Does anyone have a comprehensive list of the most common spoken words in Chinese? There are loads of lists of the most common words, but these are all for the written language as far as I can tell. Again, I'm looking for a ranked list of the most common SPOKEN words. Thanks.
Hofmann Posted March 28, 2009 at 05:52 AM
Data for such a list would be difficult to get. Why would you expect character frequencies for spoken Mandarin and Vernacular Chinese to differ significantly?
imron Posted March 28, 2009 at 06:15 AM
The Junda character frequency statistics provide separate statistics for imaginative texts compared to informative texts. I imagine there would be a significant difference again for spoken usage.
roddy Posted March 28, 2009 at 06:19 AM
OP is looking for words, not characters.
ChouDoufu Posted March 28, 2009 at 08:43 AM
I've been looking for similar data recently and haven't found anything yet. Such data would need to be based on a spoken Chinese corpus, but I haven't found one. Most spoken corpora are based on news transcripts, unscripted interview shows, and occasionally recorded family conversations and business meetings. I don't know if such data has been compiled in a systematic way for the Chinese language. I believe LDC has some Chinese corpora based on news show transcripts, but that's not a cheap solution. If you find something, let me know.
roddy Posted March 28, 2009 at 03:01 PM
If anyone wants to try and hunt something down, there are some good links to start with here.
victorhart Posted September 22, 2018 at 08:17 PM
My serious answer would be to search for existing word corpora, and, if no satisfactory ones exist, find someone with up-to-date basic data analysis skills, gather a bunch of transcripts, and make your own. I myself will do this at some point, but right now, I'm studying Mandarin with a funky method, rather than thinking seriously about an ideal learning approach (which might involve word and character frequency analysis).

My tongue-in-cheek answer, however, is:

The Power of Hao 好

Over appetizers of hummus and baba ghanoush in Riyadh, I asked two Chinese colleagues, “What’s the most frequently used word in Mandarin?” They were uncertain but ventured a couple of guesses. “No,” I disagreed, “but I know what it is.”

I’ve never looked at a Mandarin word corpus, nor have I ever researched the question at all. I’ve never even done a Google search. Nonetheless, I brashly affirm: The most frequently used word in Mandarin is: HAO 好
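For anyone who wants to try the "gather transcripts and make your own" route, here is a minimal sketch of the counting step in Python. It assumes the transcripts have already been split into words separated by spaces, which is the hard part for Chinese and is not handled here. The function name `rank_words` and the sample lines are made up for illustration and are not real corpus data.

```python
from collections import Counter


def rank_words(transcripts):
    """Return (word, count) pairs sorted from most to least frequent.

    Assumption: each transcript line is already segmented into words
    separated by whitespace. Real Chinese text would need a word
    segmenter run over it first.
    """
    counts = Counter()
    for line in transcripts:
        counts.update(line.split())
    return counts.most_common()


# Hypothetical, hand-segmented sample lines (not real corpus data).
sample = [
    "我 很 好",
    "你 好 吗",
    "我们 都 很 好",
]

ranking = rank_words(sample)
for word, count in ranking[:3]:
    print(word, count)
```

With only three toy lines the ranking means nothing; the point is that once the text is segmented, the ranking itself is a few lines of code, and the quality of the result depends almost entirely on the transcripts and the segmentation.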
Shelley Posted September 23, 2018 at 12:58 AM
I wonder if anyone is still interested after nearly 10 years. I also wonder if the words change with time. It wouldn't surprise me; all languages evolve.
大块头 Posted September 23, 2018 at 04:14 PM
19 hours ago, victorhart said: "My serious answer would be to search for existing word corpora, and, if no satisfactory ones exist, find someone with up-to-date basic data analysis skills, gather a bunch of transcripts, and make your own."

Researchers did this about a year after this thread started, by analyzing movie and TV subtitles. http://crr.ugent.be/programs-data/subtitle-frequencies/subtlex-ch
Shelley Posted September 23, 2018 at 05:11 PM
Having had time to think about it, I would say 很 is up there near the top.
DavyJonesLocker Posted September 24, 2018 at 06:08 AM
These lists by K. J. Chen and the CKIP Group of the Academia Sinica may help. The thread is nearly ten years old, but it's still a popular question, I'd imagine.
imron Posted September 24, 2018 at 06:38 AM
On 9/23/2018 at 4:17 AM, victorhart said: "The most frequently used word in Mandarin is: HAO 好"
13 hours ago, Shelley said: "Having time to think about it I would say 很 is up there near the top."

Well, according to both the character and word frequencies of Subtlex, it's neither of these. 好 just scrapes into the top 10 by word and top 15 by character, and 很 makes it into the top 25 by word and top 25 by character.

Here are the top 10 rankings, first by word:
的 我 你 是 了 不 在 他 我们 好

And now by character:
我 的 你 是 了 不 们 这 一 他
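One way to see why the two rankings differ (for example, 我们 appears in the word list while 们 appears in the character list) is that every occurrence of a multi-character word adds to the count of each character inside it. The sketch below derives character counts from word counts in Python. The function name `char_counts` and the numbers are invented for illustration and are not Subtlex figures.

```python
from collections import Counter


def char_counts(word_counts):
    """Derive character counts from a word -> count mapping.

    Each occurrence of a word contributes one occurrence of every
    character it contains.
    """
    chars = Counter()
    for word, n in word_counts.items():
        for ch in word:
            chars[ch] += n
    return chars


# Made-up counts for illustration only (not Subtlex data).
toy = {"我": 5, "我们": 3, "好": 2}

result = char_counts(toy)
for ch, n in result.most_common():
    print(ch, n)
```

In this toy example 我 gets credit both from the word 我 and from 我们, which is the kind of effect that can move a character above a word in the respective rankings.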