Chinese-Forums

Common readings list (technical problems)



Posted

Hi, 

I have been trying to find a solution to my problem for a few months now but I just can't do it; I lack the technical skills. I want to start by mentioning that I posted a shorter question on Pleco-Forums a few days ago, but this is a much more detailed account of what I'm trying to achieve, and I hope there might be some tech wiz out there who knows how to do this. I use Anki and have been doing so for ages, and for a couple of months I have had three decks, which is working fine for me.

 

1. VOCAB deck

2. Character deck (recognition)

3. Character deck (recall, write)

The first and second decks are working great, no technical issues whatsoever. For deck #3 I just use a Heisig deck. For deck #1 I am using Chinese Text Analyser to import, say, a book, then export the vocabulary into Pleco. When I import to Pleco I use the Xiandai Guifan because it (1) is authoritative and (2) has a good amount of vocab without being too large. Too large a vocabulary would result in too many words to learn ahead of every book, many of which I don't need. For example, huo3che1 che1xiang1 is considered a word by KEY, but it would just be a waste of time to learn huoche, chexiang and huoche chexiang all in Anki. After I've imported the words I change them all to the KEY entries. I then export both traditional and simplified and make cards with Traditional on the front, and the meaning and Simplified on the back.

The cards I would like would look like this:

 

EXAMPLE #1:

FRONT: 六

BACK: 

liù

num | sn 1 six 2 {music} Liu (a note on the Chinese musical scale gōngchěpǔ 工尺譜/工尺谱) 3 Liu (surname) 

Simplified: 六

 

EXAMPLE #2:

FRONT: 中

BACK:

1. zhōng 2. zhòng

(1)

n  | adj  | suf  | sn 1 centre, middle 2 in, in the middle of, amid, among 3 medium, intermediate 4 midsize 5 intermediary 6 fit for, suitable for, good for 7 [ZH-] Chinese, Sino- 8 the second (in a series of three) 9 {colloquial} all right, okay 10 in the process of, in the course of 11 Zhong (surname)
(2)

v 1 hit (a target) 2 attain (a goal) 3 be just right, fit exactly 4 win (a prize, lottery, etc. by chance) 5 be affected by 6 suffer from (e.g. an illness, a stroke) 7 fall into, sustain

Simplified: 中

 

In other words: I want common readings, but not only *the most* common reading. The most common reading has been quite easy to get thanks to the Unihan kMandarin field. kMandarin would work in the 六 example, since lu4 could be considered somewhat rare, certainly compared with the reading I'm most likely to encounter almost every time I see the character. However, for 中, zhong1 is obviously the most common, but zhong4 is common as well, hence I need both. I'm taking really simple characters as examples to make the case clear.
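For reference, a minimal sketch of how the single most common reading could be pulled out of Unihan, assuming the usual tab-separated layout of Unihan_Readings.txt (code point, field name, value). The function name `parse_unihan_field` and the sample lines are made up for illustration; in practice the lines would come from the real file.

```python
def parse_unihan_field(lines, field):
    """Return {character: list of values} for one Unihan field."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not a "codepoint, field, value" line
        codepoint, name, value = parts
        if name == field:
            # "U+4E2D" -> "中"
            result[chr(int(codepoint[2:], 16))] = value.split()
    return result


# Illustrative lines in the Unihan_Readings.txt layout (assumption: real
# data would be read from the file itself).
sample = [
    "# sample data",
    "U+516D\tkMandarin\tliù",
    "U+4E2D\tkMandarin\tzhōng",
]

most_common = parse_unihan_field(sample, "kMandarin")
print(most_common["中"])
```

This only ever yields the customary reading, so zhòng for 中 is lost, which is exactly the limitation described above.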

 

Now, I think I would be able to solve this if I were learning simplified characters, thanks to the somewhat new Unihan field kTGHZ2013, which gathers pinyin information from "One or more Hànyǔ Pīnyīn readings as given in Tōngyòng Guīfàn Hànzì Zìdiǎn (full bibliographic information below)."
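A rough sketch of how the kTGHZ2013 values might be turned into a list of readings, assuming each value is a whitespace-separated list of `location:reading` items (where a location is a page.position pair and several locations can be joined with commas). The helper name `tghz_readings` is hypothetical and the page positions in the sample are placeholders, not real dictionary data.

```python
def tghz_readings(value):
    """Extract the distinct readings from one kTGHZ2013 value string."""
    readings = []
    for item in value.split():
        # Everything after the last colon is the pinyin reading;
        # everything before it is one or more page.position locations.
        _, _, reading = item.rpartition(":")
        if reading and reading not in readings:
            readings.append(reading)
    return readings


# Placeholder locations (000.010 etc.), assumed only for illustration.
value_for_zhong = "000.010:zhōng 000.020,000.030:zhòng"
print(tghz_readings(value_for_zhong))
```

For a simplified character this would give both zhōng and zhòng, which is the "common readings" set wanted here.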

 

However, since I'm learning traditional characters, I haven't been able to match them with the most common readings. The Tongyong Guifan Hanzi Zidian also has traditional standards, but I haven't been able to get this right via Pleco no matter how hard I try.

 

I have been using the Taiwanese school grade reference for the order in which I learn the characters, mostly because it has been good for checking whether I've missed some of the 2000 that were in the earlier grades (there were a few). Please note that I am fully aware of traditional variants being different on the mainland and in Taiwan. The only reason I picked the Taiwan school grade list was that it was easy to grab and has about 5000 characters. Most of these are of course going to be the same as on the Mainland when it comes to form, but I realize some will differ. Which one I learn is not hugely important at the moment, and since I'm planning on studying Chinese history I'm going to need both anyway. KEY is the dictionary I use because it's the only character dictionary I have been able to find (ABC has several "meaningful bound form" entries, which don't provide enough information).

 

So, in short, this is the process I would want to be able to achieve:

 

1. Have e-book, which most often are in simplified Chinese. ✔️

2. Parse all Hanzi via a script, to get a frequency list. ✔️

3. Turn this into a sheet with following information: 

(1) Simplified character

(2) Traditional character (if more than one, both are needed). If I enter 只, I want at least 只 and 祇, not just one of them. If I only got one of them, that would leave one of my Anki cards empty when importing. If I got incredibly archaic variants as well, it would not be a problem since they would get filtered out by the KEY dictionary, or they would not match my Taiwanese school grade deck. The few that would not match for other reasons I could check manually in just a few minutes, I think.

(3) Common readings.

(4) Meanings (this I *should* be able to perform via Pleco, by matching the simplified[traditional] pinyin). When I only have one of the sets, it means a character like 后 or 後 would only fill one card, where I would need both.
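Steps 2 and 3 above could be sketched roughly like this, assuming the simplified-to-traditional mapping comes from the kTraditionalVariant field in Unihan_Variants.txt (same tab-separated layout as the other Unihan files). The function names and the sample variant line are illustrative assumptions; readings and meanings would still have to be added from other sources.

```python
from collections import Counter


def hanzi_frequency(text):
    """Count characters in the basic CJK Unified Ideographs block only."""
    return Counter(ch for ch in text if "\u4e00" <= ch <= "\u9fff")


def parse_traditional_variants(lines):
    """Return {simplified: [traditional, ...]} from kTraditionalVariant lines."""
    table = {}
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) == 3 and parts[1] == "kTraditionalVariant":
            simp = chr(int(parts[0][2:], 16))
            # Drop any "<source" suffix before converting the code point.
            table[simp] = [chr(int(cp.split("<")[0][2:], 16))
                           for cp in parts[2].split()]
    return table


def build_rows(text, variants):
    """One row per (simplified, traditional) pair, most frequent first."""
    rows = []
    for simp, count in hanzi_frequency(text).most_common():
        # Characters without an entry are assumed to be unchanged.
        for trad in variants.get(simp, [simp]):
            rows.append((simp, trad, count))
    return rows


# Illustrative variant line (assumption: real data comes from the file).
variant_lines = ["U+540E\tkTraditionalVariant\tU+540E U+5F8C"]
variants = parse_traditional_variants(variant_lines)
rows = build_rows("皇后后来", variants)
for row in rows:
    print(row)
```

Keeping every listed variant means 后 produces one row for 后 and one for 後, so neither card would come out empty on import.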

 

Sorry for a long and probably incoherent thread, but I don't know how to better describe my problem. All help would be immensely welcome.

 

ONE LAST THING:

Since the Simplified to Traditional thing might be one of the tougher problems mentioned, I should point out that even without that a solution would be welcome. For example: being able to export my current Taiwanese School Grade deck into Pleco with only the common readings. When I import *all* readings, 敦 provides me with diāo, duī, duì, dūn, dùn, tuán, which is just not a feasible solution.
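One way the "only the common readings" filter might look, as a sketch: take the full reading list per character and keep only the readings found in some allow-list of common readings. Where that allow-list comes from is the open question in this thread, so the `common` dictionary below is purely hypothetical sample data, and `keep_common` is a made-up helper name.

```python
def keep_common(all_readings, common):
    """Keep only readings that appear in the allow-list for each character."""
    filtered = {}
    for char, readings in all_readings.items():
        allowed = common.get(char)
        if allowed is None:
            # No allow-list entry: keep everything so the character can be
            # checked by hand instead of silently losing its readings.
            filtered[char] = list(readings)
        else:
            filtered[char] = [r for r in readings if r in allowed]
    return filtered


all_readings = {"敦": ["diāo", "duī", "duì", "dūn", "dùn", "tuán"]}
# Hypothetical allow-list, assumed only for illustration.
common = {"敦": ["dūn"]}

print(keep_common(all_readings, common))
```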

 

If there simply is no solution to this problem, I would welcome recommendations on how to go about it. I would not want to leave Anki itself, I think, unless the alternative is vastly better. One problem with Pleco flashcards (I think?) is that they only offer one entry per card, which would mean a huge workload compared to just ONE card per character.

 

EDIT: Attached is my current Anki card design, what you see is the back of the card.

ankicard.png

Posted

I am not sure I understand all your questions/desires. :)

 

One question I have is: are you using the free version of Pleco? If so, I strongly suggest you invest in the basic package (with a view to later adding things you learn you want/need) because the paid-for version of the flashcards is very powerful and allows a lot of customisation. There are a lot of dictionaries to choose from (again, some paid).

 

I have mine set up very simply, mainly because I don't need any more, but I have both simplified and traditional character sets in my dictionary and flashcards. This requires no more than just choosing the sets I wanted globally in the settings.

 

Creating flashcards with Pleco is so easy but can be as comprehensive as you want to make it. Audio, different characters sets, stroke order, various dictionaries and so much more is available with almost just one tap.

 

One thing is that Pleco is specifically and only for Chinese, whereas Anki is a general-purpose flashcard system, which in my limited experience makes it a little more tedious, though I understand it has got better.

 

If you are using the paid version then I would spend more time exploring the settings and choices available as it is a very powerful program.

 

I say all this because the paid version of Pleco is such a powerful and useful program that I feel it should be able to do all you want :)

Posted

Thank you for an elaborate answer.

 

I do indeed use the paid version of Pleco, and have the professional bundle plus some more things like the KEY dictionary, native audio and so on.

 

My main arguments for myself using Anki are the following:

 

1. Scheduling. I know the program by heart and I have extensions that allow for a large workload by spreading out the reviews in an order that makes it more bearable, but *without* going outside the algorithm's days, to keep retention high. 20 new cards per day is not a problem, and in my vocab deck I sometimes go as high as 60 a day without reaching a burnout level. The only reason I can do this without hesitation is that I can always see a forecast of where I'm heading.

 

2. The most important. Several dictionary entries on the same flashcard. This makes a slight difference when it comes to vocabulary, but a *huge* difference when it comes to characters. Most vocabulary items only have one entry, but some have more than one (especially true for beginner words), and having separate cards for them would not affect the total workload a lot. For characters, however, an entry for 龜 would in Anki result in one card with the character on the front, and three readings (guī, qiū, jūn) and meanings on the back. In Pleco, however, if I haven't misunderstood anything, to get the same result I would need three separate flashcards, one for each entry/reading. Accumulated, this means a couple of thousand cards in Anki turn into several thousand in Pleco. In other words: months turn into years.
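To make the one-card-per-character idea concrete, here is a small sketch that merges several (character, reading, meaning) entries into a single row and writes it as tab-separated text, which Anki can import. The helper names are made up, the glosses are short placeholders rather than KEY definitions, and the `<br>` separators assume HTML is allowed in fields at import time.

```python
import csv
import io
from collections import defaultdict


def merge_entries(entries):
    """Group (char, reading, meaning) tuples into one card per character."""
    grouped = defaultdict(list)
    for char, reading, meaning in entries:
        grouped[char].append((reading, meaning))
    cards = []
    for char, items in grouped.items():
        # Numbered readings on one line, numbered meanings one per line.
        readings = " ".join(f"{i}. {r}" for i, (r, _) in enumerate(items, 1))
        meanings = "<br>".join(f"({i}) {m}" for i, (_, m) in enumerate(items, 1))
        cards.append((char, readings, meanings))
    return cards


def to_tsv(cards):
    """Serialise cards as tab-separated text, one card per line."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(cards)
    return buf.getvalue()


# Placeholder glosses, assumed only for illustration.
entries = [
    ("龜", "guī", "tortoise"),
    ("龜", "qiū", "used in 龜茲"),
    ("龜", "jūn", "chapped"),
]
cards = merge_entries(entries)
print(to_tsv(cards))
```

Three dictionary entries end up as a single row, so 龜 stays one card instead of three.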

 

Since I started this post I have been doing the whole thing manually, importing each character and checking all matching readings in KEY in Pleco. It's taking quite a while, but it really seems I will spend fewer hours doing that than I have spent looking for an automated solution (oh, the irony). Note that this still, unfortunately, means I get *all* readings in the dictionary, which for KEY means a lot of them. KEY has way more entries for characters than most dictionaries, including the ABC (as ABC has the Taiwanese Mandarin readings as well, which I don't want). The only readings KEY seems to be missing are the extremely rare/archaic ones that Da Hanyu Cidian has (like jiē for 差). This is working *unexpectedly* well and I am able to retain most readings of most of the 2000 characters I have under my belt (I know more characters but I'm only counting the ones I have in my deck). There are of course exceptions that are really hard to retain, and I think 敦 might be the worst example in the end, with 6 separate entries. The aforementioned 差 has 5 entries, but for a non-beginner I would say at least 3 of them are commonly encountered (cha1, cha4 and chai1) and another one somewhat commonly encountered (ci1), which means that even if it has a lot of entries, they're easy to remember.

 

I have so far gone through, manually, the characters for Taiwan grade 3, 4, 5, 6 and 7 and today I will certainly be able to finish grade 1 and 2 (which naturally have fewer characters).

 

I was at first going to use the HK grades since they are easily attainable through Unihan, but I noticed that there are not very many of them (about 2000 if I remember correctly, or maybe a few more). I also must ask: how is a character like 着 used in modern Traditional Chinese (doesn't have to be Taiwan)? 著 seems to have taken over for most of the alternatives, but not all. Does anyone know? As I mentioned, I need to know both in the end since I'm going to read older material as well, but I'm curious. I also wonder if the character 岩 is used at all in modern Traditional novels. Most dictionaries, including KEY, seem to regard it as a purely simplified character, even if the MOE dictionaries have an entry for it. Another Mainland/Taiwan difference?
