Chinese-Forums

Universally accessible HSK word list -question



Posted

I am planning, should time permit, to create a non-pdf digital word list for the HSK.

The format I was considering is

simplified/traditional/measureword/pinyin/englishtranslation/
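A minimal sketch of how one line in this format might be read and written, assuming the five-field order given above, a trailing slash on each line, and UTF-8 text files with explicit `\n` line endings (which should help keep files consistent between Mac, Windows, and other platforms). The `Entry` class, the function names, and the sample words are illustrative assumptions, not part of any existing list.

```python
from dataclasses import dataclass

# Field order assumed from the proposed format above.
FIELDS = ("simplified", "traditional", "measure_word", "pinyin", "english")


@dataclass
class Entry:
    simplified: str
    traditional: str
    measure_word: str
    pinyin: str
    english: str


def parse_line(line: str) -> Entry:
    """Split one slash-separated line into its five fields."""
    text = line.strip()
    if text.endswith("/"):
        text = text[:-1]  # drop only the single trailing separator
    parts = text.split("/")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}: {line!r}")
    return Entry(*parts)


def format_line(entry: Entry) -> str:
    """Turn an Entry back into one slash-separated line."""
    return "/".join(getattr(entry, name) for name in FIELDS) + "/"


def save_entries(path: str, entries: list[Entry]) -> None:
    """Write entries as UTF-8 with Unix line endings on every platform."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for entry in entries:
            handle.write(format_line(entry) + "\n")


def load_entries(path: str) -> list[Entry]:
    """Read entries back, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [parse_line(line) for line in handle if line.strip()]
```

One caveat with this layout: an English translation that itself contains a slash (for example "and/or") would break the split, so a tab might be a safer separator.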

Here are my questions/concerns:

I am an Apple user, and files sometimes get garbled when moved between platforms. How can I avoid this?

What is the best application to use to make the files accessible on Palms, for printing flashcards, etc.? I was thinking of a spreadsheet.

Please let me know what you think, and any other suggestions you may have.

Thanks!

Posted

I think it is a very good idea, and I'd like to exchange ideas and resources with you.

I have almost finished editing an "HSK A" word list (甲级词):

GB: OK

B5: OK

Eng.: OK (Unihan)

German (not yet)

Wordtype: (not yet)

For editing I worked with Wenlin and Microsoft Access.

I have been using the official " 中国汉语水平考试大级 基礎 "

(The French PDF listing uses a different ordering system!)

As I am studying traditional Chinese by myself, I intend to "translate" the HSK GB materials to B5 and to generate study aids.

A Palm application for HSK would be great!

Hopefully many users will join in!

Unfortunately, I am no programmer!

ciao,

Ole

Posted

The Acrobat format (PDF) was devised to address exactly the problem you mention: creating and distributing documents that can be read on every platform and printed on any printer. Adobe has spent a lot of effort to make Acrobat Reader free, and there are free non-Adobe PDF readers and creation tools as well.

PDF is the ideal format for what you want to do, not GB or Unicode text files. A lot of platforms that can easily open PDF documents do not support Unicode or do not have Chinese fonts installed.

If what you want is to distribute a document that can be edited, then go with text files. Please prepare your word list in PDF as well, however: more people will be able to benefit from it. If you don't have a means of preparing PDF files, I can help.

Thanks for the pointer to the HSK word lists, by the way.

Posted

Would it be wise to have a database which holds the basic info (characters, pinyin, word type (verb, noun, etc.), translations in 65 global languages) and can then be used to produce material (worksheets, flashcards, lists) in whatever format is wanted?

This is the kind of project I'd be interested in supporting on here, whether just through bringing interested people together, webspace, time and effort . . .

Roddy

PS Don't rely on that time and effort . . .

Posted

If you're looking for a Chinese-English database and want something extensible, why not consider working with the Adso database? This is an edited superset of the CEDICT project containing about 112,000 words/phrases. It holds multiple definitions of common words and phrases and contains pinyin for many of them. It also breaks words down into a large number of categories:

Addresses, Adjectives, Adverbs, Auxiliary Verbs, Chengyu, Cities, Conjunctions, Countries, Measure Words, Names, Nouns, Non-Chinese Words, Numbers, Organizations, Phrases, Places, Prepositions, Pronouns, Punctuation Marks, Times, Verbs, etc.

There are a number of advantages to doing things this way over storing them in a text file. The biggest is that most of the words you'll want to include in your HSK list already exist -- you'd only need to tag existing words as appropriate for various HSK levels rather than re-entering all of the data from scratch. Another advantage of using a database over a text file is that future users/developers won't end up locked-down to any one file format. It is easy to comb through the database and create custom text files formatted for use by specific programs.
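As a rough sketch of the tag-then-export idea, assuming a single hypothetical `words` table (the table and column names here are invented for illustration and are not the actual Adso schema), and using Python's built-in `sqlite3` as a stand-in for MySQL:

```python
import sqlite3

# Hypothetical sample rows: simplified, traditional, pinyin, english, category.
SAMPLE_ROWS = [
    ("安静", "安靜", "an1 jing4", "quiet", "Adjectives"),
    ("安排", "安排", "an1 pai2", "to arrange", "Verbs"),
    ("平安", "平安", "ping2 an1", "safe and sound", "Adjectives"),
]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with an untagged word table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE words (
               id INTEGER PRIMARY KEY,
               simplified TEXT NOT NULL,
               traditional TEXT NOT NULL,
               pinyin TEXT NOT NULL,
               english TEXT NOT NULL,
               category TEXT,
               hsk_level INTEGER  -- NULL until the word is tagged
           )"""
    )
    conn.executemany(
        "INSERT INTO words (simplified, traditional, pinyin, english, category) "
        "VALUES (?, ?, ?, ?, ?)",
        SAMPLE_ROWS,
    )
    conn.commit()
    return conn


def tag_hsk_level(conn: sqlite3.Connection, simplified: str, level: int) -> int:
    """Tag an existing word with an HSK level; returns the number of rows changed."""
    cursor = conn.execute(
        "UPDATE words SET hsk_level = ? WHERE simplified = ?", (level, simplified)
    )
    conn.commit()
    return cursor.rowcount


def export_level(conn: sqlite3.Connection, level: int) -> str:
    """Produce a tab-separated text listing of every word tagged with `level`."""
    rows = conn.execute(
        "SELECT simplified, traditional, pinyin, english FROM words "
        "WHERE hsk_level = ? ORDER BY id",
        (level,),
    )
    return "\n".join("\t".join(row) for row in rows)
```

Usage would be along the lines of `conn = build_demo_db()`, then `tag_hsk_level(conn, "安静", 1)`, then `export_level(conn, 1)`. The same query could just as easily feed a flashcard or worksheet formatter, so the data is never tied to one output file.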

If you're interested, it would be trivial for me to alter the existing database to support the kind of data you and Roddy mention above. If someone has a public server with MySQL and Perl installed (Roddy?), we could even put the thing online for collaborative editing. I imagine this could be up and running in about a day. The tools are certainly all in place.

Feel free to get in touch if you have any questions.

Cheers,

--dave

Posted

I have MySQL and Perl, though I'd need to get another database added to my hosting account (I only get one with the standard set-up, and while I could add the tables to the forums database, I'd rather not).

I've been thinking along the database lines myself - something like a MySQL database with PHP scripts (which is what I'm used to from the forums) giving tailored output.

Roddy

  • 1 month later...
Posted

I'm planning to try and make some progress on this over May. If anyone has any useful files they are willing to contribute, let me know.

Roddy

Posted

I know there is free software called "virtual pdf printer", which can print anything to a PDF file. You use it just like a real laser or inkjet printer, except that nothing is printed on paper.

Since PDF documents aren't limited by platform differences, you can use a virtual printer to make your own PDF document after you edit it in MS Word, MS PowerPoint, or Word for Mac OS.

I am not sure whether a virtual printer is available for download for Mac OS. Maybe you can ask on a Mac OS forum for help.

Posted

The current plan is to produce PDF files using PHP, which is apparently quite easy if you know what you're doing. At the moment I'm looking at having the first three levels of HSK vocab in a MySQL database, and that'll be searchable (ie show me all level 2 nouns, or words that have the character 安 in them) and output initially will be to a webpage which can be printed off. The PDF thing will come later, if and when I figure out how to do it.

Later I'd like to expand this to include more specialised / higher-level vocabulary.

Roddy

Posted

Right, I spent a few hours playing with this today. At the moment I think it will be possible to search the database by

1) character - either for an exact match (ie 安 returns 安 alone) or for a fuzzy match (ie 安 returns 安排, 安静, 平安, etc)

2) by pinyin, with or without tones, exact or fuzzy

3) by tone pattern (ie all words of first tone then fourth tone)

4) by certain sets - ie 'food vocab' or 'HSK level 1 vocab'.
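The four kinds of search listed above could be sketched roughly like this. This is only an illustration over an in-memory list, not the MySQL/PHP setup being built; the field names, the sample words, the set names, and the convention of storing pinyin as space-separated syllables with tone numbers are all assumptions.

```python
import re

# Hypothetical sample data; pinyin is stored with tone numbers, one syllable per token.
WORDS = [
    {"hanzi": "安", "pinyin": "an1", "english": "peaceful", "sets": {"hsk1"}},
    {"hanzi": "安排", "pinyin": "an1 pai2", "english": "to arrange", "sets": {"hsk1"}},
    {"hanzi": "安静", "pinyin": "an1 jing4", "english": "quiet", "sets": {"hsk1"}},
    {"hanzi": "平安", "pinyin": "ping2 an1", "english": "safe and sound", "sets": {"hsk2"}},
    {"hanzi": "米饭", "pinyin": "mi3 fan4", "english": "cooked rice", "sets": {"hsk1", "food"}},
]


def by_character(words, query, fuzzy=False):
    """Exact: the word equals the query. Fuzzy: the word contains the query."""
    if fuzzy:
        return [w for w in words if query in w["hanzi"]]
    return [w for w in words if w["hanzi"] == query]


def strip_tones(pinyin):
    """Remove tone numbers, e.g. 'an1 jing4' becomes 'an jing'."""
    return re.sub(r"[1-5]", "", pinyin)


def contains_run(haystack, needle):
    """True if the list `needle` appears as a contiguous run inside `haystack`."""
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )


def by_pinyin(words, query, with_tones=True, fuzzy=False):
    """Match whole syllables, optionally ignoring tones and allowing partial words."""
    normalise = (lambda s: s) if with_tones else strip_tones
    wanted = normalise(query).split()
    hits = []
    for w in words:
        stored = normalise(w["pinyin"]).split()
        matched = contains_run(stored, wanted) if fuzzy else stored == wanted
        if matched:
            hits.append(w)
    return hits


def by_tone_pattern(words, pattern):
    """Find words whose syllable tones match `pattern`, e.g. (1, 4)."""
    def tones(pinyin):
        return tuple(int(s[-1]) for s in pinyin.split() if s[-1].isdigit())
    return [w for w in words if tones(w["pinyin"]) == tuple(pattern)]


def by_set(words, name):
    """Find words belonging to a named set such as 'food'."""
    return [w for w in words if name in w["sets"]]
```

Matching on whole syllables rather than raw substrings means a toneless fuzzy search for "an" finds 平安 but not 米饭 (whose second syllable is "fan").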

It'll also be possible to customise the output - ie, do you want characters or just pinyin, do you want pinyin with tone marks or tone numbers, do you want the english or not . . .
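For the tone marks versus tone numbers option, here is one possible sketch. It assumes pinyin is stored in lowercase with tone numbers (with `v` standing in for ü, and 5 for the neutral tone) and applies the usual placement convention: the mark goes on `a` or `e` if present, on the `o` of `ou`, and otherwise on the last vowel. The function names and the entry layout are hypothetical.

```python
# Marked forms of each vowel for tones 1 to 4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable):
    """Convert one numbered syllable, e.g. 'jing4', to its tone-marked form."""
    if not syllable or not syllable[-1].isdigit():
        return syllable  # no tone number, leave untouched
    base = syllable[:-1].replace("v", "ü")
    tone = int(syllable[-1])
    if tone not in (1, 2, 3, 4):
        return base  # neutral tone carries no mark
    if "a" in base:
        index = base.index("a")
    elif "e" in base:
        index = base.index("e")
    elif "ou" in base:
        index = base.index("o")
    else:
        vowels = [i for i, ch in enumerate(base) if ch in TONE_MARKS]
        if not vowels:
            return base
        index = vowels[-1]
    return base[:index] + TONE_MARKS[base[index]][tone - 1] + base[index + 1:]


def to_tone_marks(pinyin):
    """Convert space-separated numbered pinyin, e.g. 'an1 jing4'."""
    return " ".join(mark_syllable(s) for s in pinyin.split())


def render(entry, show_hanzi=True, tone_marks=True, show_english=True):
    """Build one output line according to the chosen display options."""
    parts = []
    if show_hanzi:
        parts.append(entry["hanzi"])
    parts.append(to_tone_marks(entry["pinyin"]) if tone_marks else entry["pinyin"])
    if show_english:
        parts.append(entry["english"])
    return "  ".join(parts)
```

Storing tone numbers and generating the marks on output keeps the data easy to type and search, while readers without suitable fonts can simply be shown the numbers.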

The difficult bit is of course typing all the info in. I think I'll probably do as much as I need for testing myself, then find a helpful typist. . . .

Roddy

Posted

Roddy,

If you need help with the HSK word list, please don't hesitate to ask me.

It would be my honor to help those who want to learn Chinese.

With my confidence in Chinese culture and history, I think I can bring our web friends more information.

Posted

OK, here are some screenshots of what I've got so far. Very basic, of course.

First one: a search for all two-character words, with characters, pinyin and English displayed. There's an option to display the tones as numbers for those without the right fonts.

screenshot1.gif

Second one: a search for all two-character phrases with a 3rd-4th tone pattern, displaying characters and pinyin only.

screenshot2.gif

Comments?

Roddy
