Erbse Posted May 14, 2009 at 12:25 PM
Hi guys, I'm working on mobile flashcard software and I'd like to use the HSK vocab. It is already split into levels 1 to 4, but each level has too much vocab in it. So I'd like to split these groups down further, but I'm not sure how to do that in a meaningful way. My idea is to create chunks of 100 words each. Is there any established system out there for creating meaningful chunks of HSK vocab? What do you think is the ideal number of flashcards per chunk? 20, 50, 100?
chrix Posted May 14, 2009 at 12:29 PM
How about grouping them by word class? Some people like to learn all nouns together and so forth. Or you could map them to frequency data and sort by that, or just randomise them.
renzhe Posted May 14, 2009 at 12:33 PM
If you already know some vocabulary, you could screen the vocabulary for things that seem familiar / logical. This is what I did. I went through each new HSK level with a text editor and copied all the vocab where I could guess at the meaning, or I had seen the word before, or I knew all the characters. Then I learned those first. It's time-intensive (took me many hours), but it can ensure that you tackle the "easy" part first and build a reasonable vocabulary quickly.
If you're starting from scratch, I'd just divide each level randomly into small chunks and go from there. You'll have to learn most of those sooner or later anyway.
roddy Posted May 14, 2009 at 12:38 PM
I think what would be very useful, and as far as I'm aware isn't currently available, is some kind of categorization.
Kitchen vocab: 筷子, 碗
Foods: 土豆, 青菜
Sports . . .
Places . . .
Etc . . .
Would take some work.
Edit: And moving.
chrix Posted May 14, 2009 at 12:57 PM
Yes, it takes an enormous amount of time. There is a reason most dictionaries are sorted alphabetically. Nevertheless, it'd be a great thing to have. There is a lot of teaching material like that available in Japanese. I've got two volumes, "2000 basic words" and "2000 additional basic words". The first one is called 中国語基本単語2000. The words are ordered thematically. Is there something similar available in English?
For instance, "料理を作る" (cooking) has the following: 汤, 炒饭, 稀饭, 炒面, 咸菜, 烧饼, 馒头, 包子, 糕, 饺子, 烧卖, 油条, 火腿, 腊肠, 切, 煮, 炒, 煎, 烤, 蒸 (don't have the advanced one here right now)
roddy Posted May 14, 2009 at 01:03 PM
What might work is grouping them by characters contained . . Obviously you're going to have overlap issues, and some very small / very large groups, but . . .
菜: 菜 cài; 白菜 báicài; 蔬菜 shūcài; 菠菜 bōcài; 青菜 qīngcài; 菜单 càidān; 芹菜 qíncài; 油菜 yóucài
火: 火车 huǒchē; 火 huǒ; 火柴 huǒchái; 灯火 dēnghuǒ; 火箭 huǒjiàn, rocket; 火力 huǒlì; 火焰 huǒyàn; 火药 huǒyào; 点火 diǎnhuǒ; 发火 fāhuǒ; 火山 huǒshān; 火灾 huǒzāi; 烈火 lièhuǒ; 恼火 nǎohuǒ; 怒火 nùhuǒ; 炮火 pàohuǒ
能: 可能 kěnéng, might; 能 néng; 能够 nénggòu; 力所能及 lìsuǒnéngjí; 能干 nénggàn; 能力 nénglì; 能源 néngyuán; 才能 cáinéng, then to be able; 功能 gōngnéng; 技能 jìnéng; 能 néng; 能 néng; 能歌善舞 nénggēshànwǔ; 能量 néngliàng; 性能 xìngnéng; 本能 běnnéng; 节能 jiénéng; 能手 néngshǒu; 太阳能 tàiyángnéng; 无能为力 wúnéngwéilì; 原子能 yuánzǐnéng; 职能 zhínéng; 只能 zhǐnéng; 智能 zhìnéng, intelligent
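For illustration, grouping by contained characters could be sketched in Python roughly as follows. The function name `group_by_character`, the `min_group_size` cutoff and the tiny word list are assumptions made up for this example, not anything roddy posted. Note how 火力 lands in two groups, which is the overlap issue mentioned above.

```python
from collections import defaultdict


def group_by_character(words, min_group_size=2):
    """Map each character to the words that contain it.

    Groups smaller than min_group_size are dropped (an assumed cutoff).
    """
    groups = defaultdict(list)
    for word in words:
        # dict.fromkeys keeps each character once, in order of appearance
        for char in dict.fromkeys(word):
            groups[char].append(word)
    return {c: ws for c, ws in groups.items() if len(ws) >= min_group_size}


# Hypothetical sample; a real run would load a full HSK level.
words = ["白菜", "蔬菜", "菜单", "火车", "火柴", "火山", "能力", "火力"]
groups = group_by_character(words)

# Print the largest groups first
for char, members in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    print(char, members)
```

A word belongs to every group whose character it contains, so an app would still need a rule for which group "owns" a word if each card should be shown only once.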
chrix Posted May 14, 2009 at 01:05 PM
That's a great idea! That way you can combine learning characters and words at the same time...
Erbse Posted May 14, 2009 at 08:08 PM (Author)
I'm mostly looking for chunks of increasing difficulty: the most common beginner words first, then rarer words, similar to what you would find in a regular textbook. However, I couldn't find a list that presents the HSK vocab in that manner, though I have to admit that I don't have an HSK study book. How is the vocab sorted in those HSK books? Are there any HSK lists that split the vocab down further than the well-known 4 levels?
@roddy, I've thought about this style, but I think it is a very specific use case. It definitely makes sense, but I'd like to add the most desired use cases first.
@renzhe, good idea, that's basically what I want to do, but I can only do that with the vocab I'm currently working on myself, which excludes levels 3 and 4. This is going to be a commercial(*) product, so I can't wait until I get to levels 3 and 4 myself. The last resort would be to use a simple list of how common each word is, but I'd rather match it with HSK books and courses/schools that are based on the HSK vocab.
@chrix Additional groups of words sorted by topic are planned for later, but I want to get the mainstream uses done first.
>>The words are ordered thematically.. Is there something similar available in English?
I know of that kind of book for German-English. It's damn well organized and researched. Haven't found any for Chinese though.
(*) A serious discount for Chinese-Forums users helping with their comments is definitely possible.
imron Posted May 15, 2009 at 12:22 AM
There might not be such lists available, but as you're a programmer it shouldn't be too difficult to take frequency data such as that found here and write a small script or program to sort the HSK level data by frequency. Although frequency is not necessarily a measure of difficulty, it will arrange them in order from most common to least common.
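A minimal sketch of the kind of script imron describes, in Python. The frequency counts below are invented for the example, and the helper names `sort_by_frequency` and `chunk` are assumptions; a real script would read the counts from whichever frequency list is used.

```python
def sort_by_frequency(hsk_words, frequency):
    """Return hsk_words ordered from most to least frequent.

    Words missing from the frequency table count as 0 and therefore end up
    last; sorted() is stable, so they keep their original relative order.
    """
    return sorted(hsk_words, key=lambda w: -frequency.get(w, 0))


def chunk(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Made-up data for illustration only.
hsk_level1 = ["火车", "我", "是", "菜单", "你"]
frequency = {"我": 900, "是": 800, "你": 700, "火车": 50}

ordered = sort_by_frequency(hsk_level1, frequency)
chunks = chunk(ordered, 2)
print(chunks)
```

With a real level, `chunk(ordered, 100)` would give the chunks of 100 asked about at the top of the thread, ordered from most to least common.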
jbradfor Posted May 15, 2009 at 01:40 AM
Personally, rather than breaking them down into chunks, keeping a fixed-size "learning list" seems much more useful to me. As you learn a word it leaves the list and the next one gets placed on the list. Declan's flashcard programs do this.
>>I'm mostly looking for chunks of increasing difficulty. Most common/beginner words first and then rarer words.
Difficulty is not the same as frequency of use. Which do you mean?
>>There might not be such lists available, but as you're a programmer it shouldn't be too difficult to take frequency data such as that found here
Those lists seem to be by character (or character pairs), not words.
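The fixed-size learning list could look something like the Python sketch below. This is only one way to read the idea; the class name `LearningList` and its methods are hypothetical, and it makes no claim about how any existing flashcard program implements it.

```python
from collections import deque


class LearningList:
    """Keeps at most `size` words active; learned words are replaced."""

    def __init__(self, words, size=20):
        self._pending = deque(words)  # words not yet introduced, in order
        self.size = size
        self.active = []              # words currently being studied
        self._refill()

    def _refill(self):
        # Top the active list back up from the pending queue
        while len(self.active) < self.size and self._pending:
            self.active.append(self._pending.popleft())

    def mark_learned(self, word):
        """Remove a learned word and pull in the next pending one."""
        self.active.remove(word)
        self._refill()


study = LearningList(["我", "你", "好", "是", "火车"], size=3)
study.mark_learned("你")
print(study.active)
```

The order of the list passed to the constructor decides which word enters next, so this approach still depends on some ordering of the vocabulary.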
chrix Posted May 15, 2009 at 01:56 AM
I think there were also some lists that included frequency data for words, but I might be mixing things up.
@Erbse: sure, there are a lot of high-quality "Grundwortschatz" publications on the German market for European languages. But as far as Asian languages go, there is really not that much in German, which is quite disappointing. Even compared with what is written in English, I was amazed how much more high-quality material there is in Japanese for learning other Asian languages. Especially for Indonesian, where there is not that much available in any Western language, whereas the Japanese market has a lot. For Mandarin the English-language situation is slightly better, but I still think you can find more in Japan. That's why I was wondering if there is any kind of thematically grouped basic vocabulary book available for English-speaking learners of Mandarin.
The only major exception is reference grammars, which are usually written in English rather than in Japanese (for Chinese, Li/Thompson or Pulleyblank come to mind), but that comes with the territory, what with English being the language of international linguistics.
Guest realmayo Posted May 15, 2009 at 04:52 AM
>>What might work is grouping them by characters contained
I'd be a bit cautious about this. When I've done something similar, I've ended up confusing a whole bunch of similar-meaning words rather than knowing them well individually, and consequently used the wrong ones all the time.
Erbse Posted May 15, 2009 at 08:16 AM (Author)
@imron, thanks for the link. I didn't know about those bigram frequency lists until now.
@jbradfor yes, that sounds interesting, yet it creates a similar problem. If I mark one word as done, which is the next word to enter my list? I still have to arrange them in some way beyond the lv1 to 4 thing.
>>Difficulty is not the same as frequency of use. Which do you mean?
Best: Sort the vocab the way the average HSK learner would expect it to be sorted. Worst: By frequency.
@chrix You're right. For many language combinations thematically grouped basic vocabulary doesn't exist at the moment. An opportunity to start a business?
@realmayo Lists sorted in that way wouldn't be the first thing on the to-do list, so don't worry
chrix Posted May 15, 2009 at 08:45 AM
@Erbse: sure, I'd be game
Scoobyqueen Posted May 15, 2009 at 10:45 AM
>>I know of such kind of book for German-English
If you are looking for a German list of HSK terms, a user here called Ole has created a comprehensive list.
renzhe Posted May 15, 2009 at 12:08 PM
>>Those lists seem to be by character (or character pairs), not words.
There are frequency lists for character pairs and I believe that there are some for character triples. Most of the words in the HSK are two-character words anyway, so the frequency of the character pair would be a good measure. You simply ignore all the character pairs that are not in the HSK vocabulary. Leave the chengyu for last.
There are also vocabulary decks for all popular Chinese textbooks -- Integrated Chinese, (New) Practical Chinese Reader, etc. You could screen those as well.
Maybe calculate a score. You have the lesson in which the word appears in different textbooks (lower score = easier), you have the frequency (higher = more important) and you have the HSK level. It must be possible to calculate a ranking based on these numbers.
Personally, I just sat down and learned them while improvising along the way
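One possible way to combine the three numbers renzhe lists into a single ranking, sketched in Python. Everything about the formula is an assumption chosen for illustration: the weights, the log scaling of frequency, the `max_lesson` and `max_frequency` caps, the treatment of words that appear in no textbook, and the sample data.

```python
import math


def difficulty_score(hsk_level, frequency, lesson=None,
                     max_lesson=50, max_frequency=1000,
                     weights=(0.5, 0.3, 0.2)):
    """Lower score = teach earlier. All scaling choices are assumptions."""
    # HSK levels 1..4 mapped onto 0..1
    level_part = (hsk_level - 1) / 3
    # Log scale so a few very common words do not dominate; rare words -> ~1
    capped = min(frequency, max_frequency)
    freq_part = 1 - math.log1p(capped) / math.log1p(max_frequency)
    # Words found in no textbook are treated as late vocabulary
    lesson_part = 1.0 if lesson is None else min(lesson, max_lesson) / max_lesson
    w_level, w_freq, w_lesson = weights
    return w_level * level_part + w_freq * freq_part + w_lesson * lesson_part


# Made-up entries: word -> (HSK level, frequency count, first textbook lesson)
entries = {
    "能源": (3, 20, None),
    "火车": (1, 50, 12),
    "我": (1, 900, 1),
}
ranked = sorted(entries, key=lambda w: difficulty_score(*entries[w]))
print(ranked)
```

The weights would need tuning against a list that learners actually find well ordered; the sketch only shows that the ranking is mechanical once the three inputs are available.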
jbradfor Posted May 15, 2009 at 02:15 PM
>>yes, that sounds interesting, yet it creates a similar problem. If I mark one word as done, which is the next word to enter my list? I still have to arrange them in some way beyond the lv1 to 4 thing.
Why do you need to arrange them? Why not just pick one?
>>Best: Sort the vocab the way the average HSK learner would expect it to be sorted. Worst: By frequency.
Well, I think you're seeing that there is no such expectation....
Erbse Posted May 15, 2009 at 08:30 PM (Author)
>>yes, that sounds interesting, yet it creates a similar problem. If I mark one word as done, which is the next word to enter my list? I still have to arrange them in some way beyond the lv1 to 4 thing.
>>Why do you need to arrange them? Why not just pick one?
Because a beginner in Chinese doesn't expect just any random level 1 word to begin with; he wants words like 我 你 好 是... at the very beginning. This is also true for the ongoing learning process. The user expects a certain ordering, especially when I want to advertise the program as "useful to accompany any HSK course".
Pendragon Posted May 23, 2009 at 11:24 AM
I think I'd go for something similar to Roddy's suggestion: You could first rank the characters by the number of times they appear in the vocabulary list (so it's character frequency rather than word frequency). Then you can add the characters one by one, starting from the most common one in the list, and see which words you can create with these characters. For example:
Word 1: Char A + Char B
Word 2: Char A + Char C
Word 3: Char B + Char C
Word 4: Char A + Char D
Word 5: Char C + Char D
in which A is the most common character in the HSK list (and B+D doesn't exist as a word).
It's true that this will result in chunks that contain lots of synonyms and similar words (which are hard to learn all at once). Maybe you could include a rule that characters with the same definition can't appear in the same chunk of words, but that may be difficult if the definitions are not in the right format (synonyms may not have precisely the same definition in the HSK list). Or, for example, a rule that the same character appears as often as possible in the same chunk, but with a set maximum (so you get some synonyms, but not too many).
I usually learn words in chunks of 15 or 20, but maybe others prefer a different number.
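A rough Python sketch of the ordering Pendragon describes: count how often each character occurs in the word list, introduce characters from most to least common, and emit each word as soon as all of its characters are known. The function name `order_by_character_unlock` is hypothetical, and breaking ties between equally common characters by the character itself is an assumption added to make the result deterministic. The synonym rules and per-chunk maximum are left out.

```python
from collections import Counter


def order_by_character_unlock(words):
    """Order words by when their characters become 'known'."""
    # Count each character once per word it appears in
    counts = Counter(ch for w in words for ch in set(w))
    # Most common first; ties broken by the character itself (assumption)
    char_order = sorted(counts, key=lambda ch: (-counts[ch], ch))

    known, ordered, remaining = set(), [], list(words)
    for ch in char_order:
        known.add(ch)
        # Words whose characters have all been introduced are now learnable
        unlocked = [w for w in remaining if set(w) <= known]
        ordered.extend(unlocked)
        remaining = [w for w in remaining if w not in unlocked]
    return ordered


# Letters stand in for characters, as in the example above.
sample = ["AB", "AC", "BC", "AD", "CD"]
result = order_by_character_unlock(sample)
print(result)  # ['AC', 'AB', 'BC', 'AD', 'CD']
```

In this sample A and C each occur three times, so C is introduced right after A and the word AC comes out first. Slicing the result into groups of 15 or 20 would then give the chunks.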
roddy Posted August 21, 2009 at 07:46 AM
Did anyone ever get anywhere with this?