
'How many characters do you know?' program



Posted

I've written a rough bit of code to test how many characters I know. You can try it out at www.machinecat.com/chinese . (Requires Flash.)

It keeps a running estimate of how many characters you know at the top right. There is no 'end' to the test... just keep going until the number settles down to a range of about 100, which should be after 20-30 questions.

It's still far from perfect, so comments, suggestions etc. are very welcome. There is some explanation of how it works in the thread here.
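The linked explanation isn't reproduced in this thread, so the following is only a rough guess at how a running estimate like this could work, not the program's actual method. It assumes characters are ordered by frequency rank and that anyone who knows a character also knows all the more common ones; it then narrows an interval by bisection until the range is about 100. The function name `estimate_known`, the `knows` callback and the total of 5000 characters are all made up for illustration.

```python
def estimate_known(knows, total=5000, target_range=100):
    """Narrow an interval [low, high] of frequency ranks by bisection.

    knows(rank) -> bool is a hypothetical callback: True if the user says
    they know the character at that frequency rank (1 = most common).
    Returns (estimate, (low, high), number_of_questions).
    """
    low, high = 0, total  # the user knows at least `low` and at most `high`
    questions = 0
    while high - low > target_range:
        mid = (low + high) // 2
        if knows(mid):
            low = mid   # assumption: all more common characters are known too
        else:
            high = mid  # assumption: all rarer characters are unknown too
        questions += 1
    return (low + high) // 2, (low, high), questions


# Simulated user who knows exactly the 1200 most common characters.
estimate, bounds, asked = estimate_known(lambda rank: rank <= 1200)
print(estimate, bounds, asked)
```

Real answers are noisier than this (people know some rare characters and miss some common ones), so a sketch like this converges in far fewer questions than the 20-30 mentioned above and would need some tolerance for inconsistent answers.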

Posted

Hey, nice program, though I don't think the questions are specific enough. E.g. in my Chinese and Japanese studies I have come across a good many characters outside the required learning, but never actually took the time to memorize them. Some of these came up, and though I didn't 'know' them, I knew of them, so I marked yes (:wink:). I think that if this were to be an even better program than it is now, you should expand the questions to ask for more information, like: what is the pronunciation of this character (maybe have a dialog box for it), and maybe have another dialog box for the meaning. That should give a more accurate description of what characters you "know".

Besides all that though, I was wondering, what language was this program written in? Java right? I wish I could write something this cool...

nipponman

Posted

Actually it's a Flash movie which uses Macromedia's own scripting language, ActionScript. It's not very different from Java.

You've hit the big problem there, nipponman... how can you define 'know' a character? I spent ages wondering about that before just deciding, hey, let the user decide.

Posted

I've done a little bit of ActionScript stuff, found it fairly easy to work with, and when I figure out how to get Flash interacting with PHP scripts and MySQL databases I'm going to be UNSTOPPABLE (although by then the internet will be obsolete and we'll all have wireless brains).

As for judging if someone 'knows' a character, it is tricky. First off you need info associated with the character: its pronunciation / meaning / stroke order / radical / whatever. That's one issue. Then you need to generate 'wrong' answers for the multiple choice 'fillers'.

I'm hoping to come up with something similar to this soon, but word based (although I'm likely to have a character based version as well, as once you've done one . . .) as a part of my HSK vocab lists.

Roddy

Posted
Some of these came up, and though I didn't 'know' them, I knew of them, so I marked yes(:wink: )

At a minimum, you should know one pronunciation and one definition of the character to qualify as "knowing." You just have to be honest with yourself with the program in its current state. There are still, of course, borderline cases where you think you "know" it but are not sure.

As an upgrade, the program could reference a dictionary file like CEDICT and ask the user to pick between a correct choice (pinyin + definition) and some wrong ones (randomly picked from the dictionary).

You'll have to deal with characters with multiple pronunciations and definitions. I doubt the existing dictionary files tell you which pronunciation/definition is the most common. Maybe just randomly pick one.
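A minimal sketch of that idea, under some assumptions: the sample lines below are written in the CC-CEDICT line layout (`Traditional Simplified [pinyin] /gloss/gloss/`) but are illustrative entries, not copied from the real file, and the names `parse_line` and `make_question` are made up for this example. When a character has several readings or definitions, one is picked at random, as suggested above.

```python
import random
import re

# CC-CEDICT-style line: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
LINE_RE = re.compile(r"^(\S+) (\S+) \[([^\]]+)\] /(.+)/$")

# Illustrative entries only (assumption: not quoted from the real dictionary).
SAMPLE_LINES = [
    "乞 乞 [qi3] /to beg/",
    "腫 肿 [zhong3] /to swell/swelling/swollen/",
    "薯 薯 [shu3] /potato/yam/",
    "蕊 蕊 [rui3] /stamen/pistil/",
    "婷 婷 [ting2] /graceful/",
]


def parse_line(line):
    """Parse one dictionary line into a dict, or return None if it doesn't match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    trad, simp, pinyin, glosses = m.groups()
    return {"trad": trad, "simp": simp, "pinyin": pinyin,
            "glosses": glosses.split("/")}


def make_question(entries, target, n_wrong=3, rng=random):
    """Return (choices, correct_index) for the simplified character `target`."""
    def label(entry):
        # Pick one definition at random for this entry.
        return f"{entry['pinyin']}: {rng.choice(entry['glosses'])}"

    # A character may have several readings; pick one of them at random.
    readings = [e for e in entries if e["simp"] == target]
    correct = label(rng.choice(readings))

    # Wrong answers are drawn at random from entries for other characters.
    others = [e for e in entries if e["simp"] != target]
    choices = [label(e) for e in rng.sample(others, n_wrong)] + [correct]
    rng.shuffle(choices)
    return choices, choices.index(correct)


entries = [e for e in map(parse_line, SAMPLE_LINES) if e is not None]
choices, answer = make_question(entries, "乞")
for i, choice in enumerate(choices):
    print(i, choice)
```

Purely random wrong answers are often easy to rule out; picking distractors that share a radical or a similar sound would make guessing harder, at the cost of more bookkeeping.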

Posted

That is what I am saying. There is too much room left for me to be "honest". Now, I'm an honest guy (really, I am) but sometimes when it comes to characters I might be tempted to fudge the results for myself. It's not like I'm cheating, but for example I know 乞 means to beg. But I think the tone is the fourth tone, when it is actually the third. So, I know the meaning but not the full pronunciation. But if I *think* the tone is 4, and I *think* I know the meaning, I'm gonna say that I know the character. On the other hand, if there were 2 dialog boxes for these respective aspects, then I would be able to qualify my *knowledge*.

See how that works? :)

nipponman

P.s. how do you guys like my avatar?

Posted
sometimes when it comes to characters I might be tempted to fudge the results for myself. It's not like I'm cheating, but for example I know 乞 means to beg. But I think the tone is the fourth tone, when it is actually the third. So, I know the meaning but not the full pronunciation. But if I *think* the tone is 4, and I *think* I know the meaning, I'm gonna say that I know the character.

In this case, since you know the meaning and presumably its pinyin ("qi"), but you're not sure about its tone (3rd or 4th), I would say that you can click on "yes, I know it." But if you want to be strict about it, then you'd click "yes" only if you're completely sure about the definition and the pronunciation (tone included). In any case, I think knowing at least one definition and its pinyin without the tone mark is a must to qualify as "knowing." With this level of familiarity, you should be able to pronounce the character in a word combination and have a person fluent in Chinese understand what you're saying. In a word combination, getting the tone of any single character right is often not critical.

As the program stands now, the standard is up to you, which I think is fine as an approximation for personal use.

Posted
In this case, since you know the meaning and presumably its pinyin ("qi"), but you're not sure about its tone (3rd or 4th), I would say that you can click on "yes, I know it." But if you want to be strict about it, then you'd click "yes" only if you're completely sure about the definition and the pronunciation (tone included). In any case, I think knowing at least one definition and its pinyin without the tone mark is a must to qualify as "knowing." With this level of familiarity, you should be able to pronounce the character in a word combination and have a person fluent in Chinese understand what you're saying. In a word combination, getting the tone of any single character right is often not critical.

While all this is true, according to a working definition, you still wouldn't "know" the character. Whether or not a Chinese person would understand is immaterial; the question is one of knowledge, because even if a Chinese person could understand you, a native speaker (or a fluent speaker, which is what we are all trying to become:) wouldn't mess up the tones.

Posted
Whether or not a Chinese person would understand is immaterial; the question is one of knowledge,

??? Why not? If you know the correct pinyin, you should be able to pronounce it (assuming that you know the pinyin system) so that it would be understandable by a person fluent in the language. That's why I said you should know the pinyin (possibly without the tone mark) to qualify as "knowing" the character.

a native speaker (or a fluent speaker, which is what we are all trying to become:) wouldn't mess up the tones.

A native speaker can mess up the tone of an unfamiliar character, just like anyone else.

Posted

It seems most people are agreed that 'knowing' a character requires knowledge of both pronunciation and meaning. Here are some problems I've come across while wondering about how to test them.

1) Pronunciation. I'm reluctant to be too strict on this, as there are many characters I know and use comfortably but am not completely sure of the correct pronunciation: tones and stuff like shan/shang, kuo/huo. Also, if testing by pronunciation, people can just use the radicals to successfully guess the answer of characters they don't actually know, eg. 腫 zhong, 杋 fan, and probably most characters in fact.

2) Meaning. Multiple meanings can cause problems, but I could just ignore the less common meanings. Again, meaning is difficult to test because people can just use the radicals to guess: 薯 is probably a plant or vegetable, 胧 is probably a part of the body. Names of people and places also cause problems: most Chinese can pronounce characters like 蕊 and 婷 that are common in given names, but not so many will know their original meaning. Then there are other characters like 吴 and 湘 that are only used in names.

Roddy, I've been trying to figure out how to get Flash to interact with other stuff too. You can open a URL and send data by GET, but I can't see any way to receive data. Shockwave is probably better if you want to do something complicated.

Posted

You can include HTML pages in Flash, and those HTML pages can be parsed with PHP. Getting database information into Flash as variables, and changing this as it runs, is trickier, but it seems to be possible to import XML files as variables, and those XML files could be dynamically generated. I'll get there. Slowly. After making many rash promises.

Roddy
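To illustrate the "dynamically generated XML" idea only: the sketch below uses Python in place of PHP, and a hard-coded list in place of a MySQL query, so everything here (the `ROWS` data, the `rows_to_xml` name, the element and attribute names) is an assumption made for the example. A server script along these lines would print the XML as its response, and the Flash movie would load that URL.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows standing in for the result of a database query.
ROWS = [
    {"rank": 1, "char": "乞", "pinyin": "qi3", "meaning": "to beg"},
    {"rank": 2, "char": "薯", "pinyin": "shu3", "meaning": "potato; yam"},
]


def rows_to_xml(rows):
    """Serialize character rows as a small XML document (returned as a string)."""
    root = ET.Element("characters")
    for row in rows:
        node = ET.SubElement(root, "character",
                             rank=str(row["rank"]), pinyin=row["pinyin"])
        node.set("meaning", row["meaning"])
        node.text = row["char"]  # the character itself is the element text
    return ET.tostring(root, encoding="unicode")


print(rows_to_xml(ROWS))
```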

Posted
??? Why not? If you know the correct pinyin, you should be able to pronounce it (assuming that you know the pinyin system) so that it would be understandable by a person fluent in the language. That's why I said you should know the pinyin (possibly without the tone mark) to qualify as "knowing" the character.

But, it wouldn't be correct. Sure you could pronounce some words all willy-nilly and ascribe any tone to any word, but then you wouldn't be correct, and you definitely wouldn't know the characters that are associated with the words you're saying.

Quote:

a native speaker (or a fluent speaker, which is what we are all trying to become:) wouldn't mess up the tones.

A native speaker can mess up the tone of an unfamiliar character, just like anyone else.

But we're not talking about unfamiliar characters, gato; we're talking about characters that one claims to know, which by definition cannot be unfamiliar. A native speaker will make errors (occasionally) on even common words or characters, just like in English. But also like in English, 95% of the time it will be correct, if they know it. Otherwise, it's a different story.

Posted

When I was doing the test, I marked those characters as known where I knew Pinyin and tone (or, thought I knew; some confirmation here, be it interactive or not, would really help), but not necessarily meaning.

Knowing the meaning of individual characters seems to me negligible, as many characters of which I know the meaning end up, when combined with other known characters, at meanings that are beyond my understanding. So for meaning, I focus on words, not on characters (while of course, you tend to pick up the notion of a certain character if you have seen it in a couple of combinations.)

"Knowing a character" for me means being able to look up a new combination involving this character without having to waste time on the character-lookup in the dictionary...

About your planned improvements, you really shouldn't worry too much about people cheating. After all, you are designing a tool for their personal use, and not a substitute for the HSK.

And even if people cheat and get inflated numbers, a boost of morale in character learning certainly wouldn't hurt... :wink:

Posted

This is a cool test. I really like it.

As for defining what "knowing a character" is, I think the program is great as it is. As smalldog says, let the user decide. Rather than one single test, I see it as a collection of tests. It is up to me to decide whether I want to run the program clicking yes whenever I have some notion of the meaning of the character, or hit yes only if it is a character I feel I can write, or hit yes if I know the pinyin, or whatever.

Of course different criteria will give different results, but it is up to the user to decide what sort of knowledge they want to test.

Posted

Thanks for the feedback, guys. When I have the time I'll add some "check your answer" functionality whereby you can check the pronunciation and meaning before you decide whether you know the character or not.

  • 1 year later...
Posted

Wow, that was good ^_^. I've always wondered how many characters I know. I got around 410-430, but I was usually around 418 ^_^. It was good.

I'm looking forward to more characters being put on it.

Maybe it'd be a good idea to put some pointers at the end of the score, e.g. what you need to do to improve your character knowledge, or just basic tips for grammar and the language etc.

Posted

It doesn't seem to be working for me. After yes on the first characters it claims I know 995 chars, then the next click goes to 967...
