
HSK Online Searchable Vocabulary Database now in Beta Test



Posted

Level 3 probable errors (correct/incorrect)

必需/必须

不比/不必

不像话/不象话

裁缝/采风

出息/出席

得意/得以

电铃/电令

火力/活力

客堂/课堂

搂/楼

鹿/露

气压/欺压

塞/赛

事务/事物

田地/天地

陷/馅

行驶/行使

绣/锈

学员/学院

元宵/远销

怨/苑

俄语 pronunciation spelling

Also noticed a couple more in level 1

饿/俄

散步/散布

  • 2 weeks later...
Posted

Big thanks to 楚留香 for some eagle-eyed proofreading. The only ones I didn't change were

桨/奖

客堂/课堂 and

鹿/露

as these appeared to be right already, according to the book I was working off. I have however found a more recently published book which has some extra / missing words. I'll make those extra changes soon, I hope.

At the moment I haven't yet updated the online database, so don't go looking for those changes - they aren't there yet, only on my local development copy. I hope to put in a more efficient searching mechanism soon, which will allow more choice in what is displayed and also how it is sorted, and the updated database will go on at the same time. Suggestions, as always, welcome. Especially simple ones.

Roddy

Posted
Level 3 probable errors (correct/incorrect)

必需/必须

According to 現代漢語詞典 -

須1 xū (1) 須要:務須|必須|須知|事前須做好準備。(2) (Xū) 姓。

需 xū (1) 需要:需求|按需分配。(2) 需用的東西:軍需。

So it is not incorrect. I think 需 and 須 differ mainly in usage, in that 需 is followed by nouns (我需要一幢房子) and it can be used as a noun itself (按需要分配房子), whereas 須 is followed by verbs (itself being auxiliary) (我必須買一幢房子).

Posted

Both 必需 and 必须 are words in the HSK. But according to my HSK book, the level 3 word should be 必需, not 必须. 必须 is a word in level 1 only.

BTW, I looked through the level 4 word list and found 0 errors. It's the largest list; I thought I'd find at least one typo. The only minor difference was my list also has 担 dan4 as a level 4 word.

Roddy, you're right, 客堂/课堂, 鹿/露 and 桨/奖 were not mistakes. I'm not sure how those got in there.

Posted

Probably as there was more stuff I didn't know in the 4th list - I was going a lot more slowly. With the others I probably was a bit over-confident.

Roddy

Posted

Right, assuming the stuff I'm doing right now works . . .

You now have a shiny new search engine to play with. I won't give you the details, as it should be fairly self-explanatory. Suggestions on other useful features are very welcome.

HERE

Roddy

Posted

Roddy, this is absolutely fantastic. :clap I can't imagine how much work must have gone into that, but thanks. I might even have to take the HSK again now, this will make it so much easier to prepare. :D

Being able to output by tone combinations is a great idea by the way, I hadn't even thought of that. And the way you have set it up makes it really easy to output to a supermemo flashcard database as well, which is perfect. I can't wait for the English translations now. :D
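For anyone curious what output by tone combination could look like, here is a minimal sketch in Python. It is not how the site's search engine works (that isn't described in this thread); `tone_pattern` and `group_by_tone` are hypothetical names, and it assumes numbered pinyin where only a final syllable may lack a tone digit (treated as neutral tone, written 5).

```python
import re
from collections import defaultdict

# One syllable = a run of letters plus an optional tone digit.
SYLLABLE = re.compile(r"[a-zü]+[1-5]?")

def tone_pattern(pinyin):
    """Return the tone digits of numbered pinyin, e.g. 'guo2jia1' -> '21'.
    A syllable with no digit is recorded as neutral tone '5'."""
    syllables = SYLLABLE.findall(pinyin.lower())
    return "".join(s[-1] if s[-1].isdigit() else "5" for s in syllables)

def group_by_tone(entries):
    """Group (hanzi, pinyin) pairs by their tone pattern."""
    groups = defaultdict(list)
    for hanzi, pinyin in entries:
        groups[tone_pattern(pinyin)].append(hanzi)
    return dict(groups)

# Sample entries taken from words mentioned in this thread.
words = [("国家", "guo2jia1"), ("家庭", "jia1ting2"),
         ("开学", "kai1xue2"), ("名字", "ming2zi")]
groups = group_by_tone(words)
```

With the sample list, 家庭 and 开学 end up together under the pattern 12, which is the kind of grouping that is handy for drilling tone pairs.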

Posted
the way you have set it up makes it really easy to output to a supermemo flashcard database

That was clever of me, as I have no idea at all what a supermemo flashcard database is.

As for English translations - someday, someday - but not necessarily soon.

Currently working on .pdf output, but it's proving trickier than I thought.

Roddy

  • 4 weeks later...
Posted
Currently working on .pdf output, but it's proving trickier than I thought.

跟着我说 ........ arrrrrrrrrrrrrrrrrrrrrrrrrrrrrgggggggggggggggh!!!!!! :wall

Posted

Some more pronunciation corrections. I've checked them against Xiandai Hanyu Cidian, so any errors in the corrections are probably my own typos :-) If a correction looks wrong, check it against the actual characters :-? Note: in addition to pinyin corrections, I've also listed entries that have underscores in the pinyin but not in the characters, and cases where it seems v hasn't been converted properly to ü. Some entries in the database use ü and some use v, and I figure ü is preferable.

短 should be duan3 not duan4

飞 should be fei1 not fei

附近 should be fu4jin4 not fu3jin4

挂 should be gua4 not guan1

国 should be guo2 not guo4

国家 should be guo2jia1 not guo4jia1

还 should be huan2 not huan4

回去 should be hui2qu4 not hui2lai4

火车 should be huo3che1 not huo3che4

家庭 should be jia1ting2 not jia4ting2

jiang1lai2 is missing the 来 character

街 should be jie1 not hie1

开学 should be kai1xue2 not kao1xue2

历史 should be li4shi3 not li2shi3

名字 should be ming2zi not ming1zi

拿 should be na2 not na4

努力 should be nu3li4 not nü3li4

热 should be re4 not re2

热情 should be re4qing2 not re2qing2

认为 should be ren4wei2 not ren2wei

认真 should be ren4zhen1 not ren2zhen1

上午 should be shang4wu3 not shang4qu4

实践 should be shi2jian4 not shi4jian4

送 should be song4 not xong4

香 should be xiang1 not xiang1xin4

响 should be xiang3 not xiang4

阿姨 should be a1yi2 not ai1yi2

办事 should be ban4shi4 not ban4shi1

XDHYCD lists 叉 as having the pronunciations cha1, cha2 and cha3, but lists 叉子 under cha1zi, not cha3zi

成长 should be cheng2zhang3 not cheng2chang3

翅膀 should be chi4bang3 not chi4ban3

大多数 should be da4duo1shu4 not da4duo3shu4

当前 should be dang1qian2 not dang1shi2

当时 should be dang1shi2 not dang3

到处 should be dao4chu4 not dao3chu4

低下 should be di1xia4 not di3xia4

电梯 should be dian4ti1 not dain4ti1

对比 should be dui4bi3 not dui3bi3

鹅 should be e2 not e4

放心 should be fang4xin1 not fang4xin4

分别 should be fen1bie2 not fei1bie2

改进 should be gai3jin4 not gai3jiin4

高度 should be gao1du4 not gan1du4

略 should be lüe4 not lve4 (I know v is commonly used for ü but other words in the database use ü so I thought I should mention it)

侵略 same problem with lve4

试验 should be shi4yan4 not shi2yan4

先后 has an underscore in the pinyin

儿女 also has an underscore in the pinyin

掠夺 has the same problem above with lve4

上下 has an underscore in the pinyin

少先队 also has underscores in the pinyin

是非 also has an underscore in the pinyin

是否 also has an underscore in the pinyin

战略 same problem with lve4

才智 also has an underscore in the pinyin

财经 also has an underscore in the pinyin

策略 same problem with lve4

出入 also has an underscore in the pinyin

废物 should be fei4wu4 not fei1wu4

闺女 should be gui1nü not gui1nv

好坏 also has an underscore in the pinyin

忽略 same problem with lve4

来看 also has underscores in the pinyin

利弊 also has an underscore in the pinyin

略微 same problem with lve4

省略 same problem with lve4

收支 also has an underscore in the pinyin

往返 has an underscore

问答 has an underscore

邮电 has an underscore

装卸 has an underscore
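Three of the problem types in the list above (underscores in the pinyin only, v left in place of ü, and a pronunciation with more or fewer syllables than the word has characters) can be flagged mechanically. The following is a rough sketch, not the process actually used to compile the list; `check_entry` is a hypothetical helper, the underscore position in the example is a guess, and the syllable count assumes numbered pinyin with no erhua and with any toneless syllable at the end of the word.

```python
import re

# One syllable = a run of letters plus an optional tone digit.
SYLLABLE = re.compile(r"[a-zü]+[1-5]?")

def check_entry(hanzi, pinyin):
    """Return a list of suspected problems for one (hanzi, pinyin) entry."""
    problems = []
    if "_" in pinyin and "_" not in hanzi:
        problems.append("underscore in pinyin only")
    if "v" in pinyin.lower():
        problems.append("v used for ü")
    # Compare the number of syllables with the number of characters.
    syllables = SYLLABLE.findall(pinyin.lower().replace("_", ""))
    if len(syllables) != len(hanzi.replace("_", "")):
        problems.append("syllable count does not match character count")
    return problems

report = {
    "香": check_entry("香", "xiang1xin4"),
    "当时": check_entry("当时", "dang3"),
    "略": check_entry("略", "lve4"),
    "先后": check_entry("先后", "xian1_hou4"),  # underscore position assumed
    "国家": check_entry("国家", "guo2jia1"),
}
```

A check like this cannot catch wrong tones (guo4 for 国) or swapped words (hui2lai4 for 回去); those still need a dictionary or a pair of eyes.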

Posted

:(

Thanks. Don't know when I'll get time to do anything about it though.

Roddy

Posted
I've checked them against Xiandai Hanyu Cidian

a dictionary is intended to impose "preferred spellings" and "standards" ... it's not law...

thank you for looking these over, but the fact that Roddy is not using notation of the dictionary is not an indication of error...

Posted

Not necessarily... the more Chinese learners and professors I talk to, the more I discover that although there is a correct way to speak, there is no correct way yet to romanize...

is HSK grading subjective? or is there a strict rubric?

Posted

Actually, I believe the HSK solely uses pinyin for romanisation - probably because it was developed in Mainland China, and if you are talking about modern Mandarin Chinese as spoken on Mainland China (i.e what the HSK is testing) then there really is only one way to romanise characters. This is not to say that other systems don't exist or aren't valid, just that an "HSK Word List" should use the same system that you'll be tested on when taking the HSK.

Also, I should point out that several of the corrections were for obvious typos, e.g. xong4, jiin4 and hie1, which are invalid pinyin, and also things like 香, which had two syllables listed for its pronunciation despite there only being one character. These would be classified as errors regardless of what system you use.
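To illustrate what "invalid pinyin" means here, this is a rough sketch of a syllable check. It is an approximation built from a list of finals plus a few well-known initial/final restrictions, not a complete table of Mandarin syllables, so it will accept some syllables that don't exist. The names `split_syllable` and `looks_valid` are made up for the example.

```python
import re

# Two-letter initials come first so 'zh' is not read as 'z' + 'h...'.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

# Finals as written in pinyin, with v standing in for ü.
FINALS = {"a", "o", "e", "i", "u", "v", "ai", "ei", "ao", "ou", "an", "en",
          "ang", "eng", "ong", "er", "ia", "ie", "iao", "iu", "ian", "in",
          "iang", "ing", "iong", "ua", "uo", "uai", "ui", "uan", "un",
          "uang", "ue", "ve"}

def split_syllable(syllable):
    """Split one syllable (tone digit optional) into (initial, final)."""
    s = re.sub(r"[1-5]$", "", syllable.lower().replace("ü", "v"))
    for initial in INITIALS:
        if s.startswith(initial):
            return initial, s[len(initial):]
    return "", s

def looks_valid(syllable):
    """Rough plausibility check; False means 'not a possible syllable'."""
    initial, final = split_syllable(syllable)
    if final not in FINALS:
        return False
    if initial in ("j", "q", "x"):
        # j/q/x only combine with i- finals and ü- finals (written u).
        return final[0] in "iu" and final not in ("ua", "uo", "uai", "ui", "uang")
    if initial in ("g", "k", "h", "f"):
        return final[0] not in "iv"
    if initial in ("zh", "ch", "sh", "r", "z", "c", "s"):
        return final == "i" or final[0] not in "iv"
    return True

rejected = [s for s in ["xong4", "jiin4", "hie1", "dain4"] if not looks_valid(s)]
```

Typos that happen to produce another real syllable (gan1 for gao1, fei1 for fen1) pass this check, so it only covers the "regardless of what system you use" class of errors.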

Anyway, I didn't post the list as a criticism of the database (which I think is a magnificent effort and worthy of much praise), I posted it to help improve what is surely going to become a very valuable online resource for people studying Chinese.

Speaking of which, Roddy, I was thinking about writing a small program to run the HSK word list against something like Cedict, and then generate a new word list with English meanings (saves typing them all in by hand!). If you're interested in something like this, let me know. I could knock it up pretty quickly as I'm currently writing a Chinese reader program, and most of the code for doing something like that is there already.
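As a rough idea of what such a program might involve, here is a minimal sketch. It assumes the CC-CEDICT text format, where each entry line reads `Traditional Simplified [pin1 yin1] /gloss/gloss/` and comment lines start with `#`. The function names are hypothetical and the sample lines are written for illustration rather than copied from the real dictionary file.

```python
import re

# Traditional, simplified, [pinyin], then slash-delimited glosses.
CEDICT_LINE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")

def parse_cedict(lines):
    """Build {simplified: [(pinyin, [glosses]), ...]} from CEDICT-style lines."""
    table = {}
    for line in lines:
        if line.startswith("#"):
            continue  # comment or header line
        match = CEDICT_LINE.match(line.strip())
        if not match:
            continue  # skip anything that isn't a well-formed entry
        _traditional, simplified, pinyin, glosses = match.groups()
        table.setdefault(simplified, []).append((pinyin, glosses.split("/")))
    return table

def annotate(words, table):
    """Pair each word with its English glosses ([] if not found)."""
    result = []
    for word in words:
        glosses = [g for _pinyin, gs in table.get(word, []) for g in gs]
        result.append((word, glosses))
    return result

# Illustrative sample in CEDICT layout (not real dictionary data).
sample = [
    "# sample dictionary",
    "國家 国家 [guo2 jia1] /country/nation/state/",
    "歷史 历史 [li4 shi3] /history/",
]
table = parse_cedict(sample)
annotated = annotate(["国家", "历史", "火车"], table)
```

Words missing from the dictionary come back with an empty gloss list, so they can be picked out and filled in by hand.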

Posted
Anyway, I didn't post the list as a criticism of the database (which I think is a magnificent effort and worthy of much praise), I posted it to help improve what is surely going to become a very valuable online resource for people studying Chinese.

well... give roddy a few days. he's a little slow. :mrgreen:

*geek_frappa bans himself...

Posted

Right, that list of corrections from imron has now gone in, with the following caveats . . .

1) I missed the bangongshi one, as I was concentrating on the long list.

2) the underscores in some places are meant to indicate that this is a conjunction and you need something to go there - ie ruguo_dehua. However, this hasn't been very consistently done.

3) The Word Macro I use doesn't always convert V to U+umlaut - I've left this though, as I daresay you are wise enough to figure it out. If you are doing a pinyin search, use 'v' though - the pinyin searches are done on the 'unmacroed' column, so use 'v's and numbers.
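For point 3, the conversion from the searchable form (v and tone numbers) to the display form can be sketched as below. This is not the Word macro mentioned above, just an illustration under the usual tone-mark placement rule: the mark goes on a or e if present, on the o of ou, and otherwise on the last vowel. `mark_syllable` and `to_marked` are hypothetical names.

```python
import re

# Tone-marked forms for tones 1-4 of each vowel.
MARKS = {"a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
         "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ"}

def mark_syllable(letters, tone):
    """Convert one syllable such as ('lve', '4') to tone-marked pinyin."""
    letters = letters.replace("v", "ü")
    if tone in ("", "5"):
        return letters  # neutral tone carries no mark
    if "a" in letters:
        target = "a"
    elif "e" in letters:
        target = "e"
    elif "ou" in letters:
        target = "o"
    else:
        vowels = [c for c in letters if c in MARKS]
        if not vowels:
            return letters
        target = vowels[-1]
    i = letters.rindex(target)
    return letters[:i] + MARKS[target][int(tone) - 1] + letters[i + 1:]

def to_marked(pinyin):
    """Convert numbered pinyin (with v for ü) to tone-marked pinyin."""
    marked = re.sub(r"([a-zü]+)([1-5])",
                    lambda m: mark_syllable(m.group(1), m.group(2)),
                    pinyin.lower())
    # Toneless trailing syllables (e.g. the 'nv' in 'gui1nv') still need ü.
    return marked.replace("v", "ü")

examples = [to_marked(p) for p in ["lve4", "gui1nv", "guo2jia1"]]
```

Keeping the v-and-numbers column for searching and generating the marked form from it means the two can never disagree.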

Many thanks for the help - if anyone loses any needles in haystacks, I heartily recommend they employ imron . . .

Roddy

  • 3 weeks later...
