
ZDT: Importing files into ZDT


Iluvchinese


Hello guys!

It's my first time using ZDT, and it won't let me import this file (see the attachment below). The file is UTF-8. gato posted it for me and suggested that I use it to learn Chinese, but ZDT refuses to import it. Instead, it says:

"Unable to parse file. Check that this file is valid and you're using the correct element delimiter."

Please help!

MostFrequent2200.txt


Just goes to show that you shouldn't trust gato :lol:

This file format is not regular enough for ZDT to import it well. Here are the first few lines:

1 的 [de] (grammatical particle) [dì] 目的 mùdì goal [dí] 的确 [dī] cab

2 一 [yī] one; 一定 certain; 一样 same; 一些 some

3 是 [shì] to be

4 不 [bù] not [bú]

5 了 [le] (particle) [liǎo] 了解 comprehend [liào] (=瞭) [liāo] [liáo]

6 人 [rén] person; 人类 rénlèi humankind; 有人吗? anybody here?

7 在 [zài] at; 现在 xiànzài now; 存在 cúnzài exist

8 我 [wǒ] I, me; 我们 wǒmen we

9 有 [yǒu] have; there is; 没有 haven't; 有的 some [yòu] (=又)

10 中 [zhōng] middle; in; 中国 Zhōngguó China [zhòng] hit (a target)

11 这(F這) [zhè] [zhèi] this

12 大 [dà] big; 大家 dàjiā everybody [dài] 大夫 dàifu doctor

13 国(F國) [guó] (国家) country; 中国 China; 美国 USA

There's no field delimiter, the traditional form is not given in a regular way, and many lines contain more than one pinyin reading.

If you're good with awk or something like that, you could probably parse it into a format ZDT can understand.


It's going to be a lot of work. Unless you really want to use this file, you're probably better off looking for other word lists. ZDT's website has some, and the HSK word lists are posted on this forum. Or ask gato whether he has the data in another format.

If you really want to use this file, you'll probably want to write a program that converts each line into one that ZDT can understand. I recommend this layout, with a tab between the fields:

Chinese Character <TAB> pinyin <TAB> Definition

So you'll need to strip off the line number, remove the traditional form (if present), and, for lines that have more than one pinyin, break them into several lines. Turning the "example" words into separate entries of their own is more work still.

For example, the first five lines could be converted into the following. Here ??? marks pinyin that is missing from the file (I'm not sure how to get that), and the fields should be separated by tabs, not spaces.

的 de (grammatical particle)

目的 mùdì goal

的确 dī??? cab

一 yī one

一定 ??? certain

一样 ??? same

一些 ??? some

是 shì to be

不 bù not

了 le (particle)

了解 liǎo??? comprehend
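
A minimal Python sketch of that conversion, in case it helps. It assumes every line looks like the samples quoted earlier in the thread: a line number, one headword character, an optional traditional form written as `(F…)` with ASCII parentheses, and then one or more `[pinyin]` readings, each followed by free text. The names `convert_line` and `to_zdt_row` are made up for this example, and whether ZDT accepts rows with an empty definition is not something this thread confirms. The sketch only splits by reading; it does not try to break out the example words or guess their missing pinyin.

```python
import re

# Line number, headword, optional "(F<traditional>)" marker, then the rest.
LINE_RE = re.compile(r"^\s*\d+\s+(\S+?)(?:\(F[^)]*\))?\s+(.*)$")
# One "[pinyin]" reading followed by whatever text precedes the next "[".
READING_RE = re.compile(r"\[([^\]]+)\]\s*([^\[]*)")


def convert_line(line):
    """Return a list of (headword, pinyin, definition) tuples for one line.

    Lines that do not match the assumed layout yield an empty list.
    Readings with no text after them get an empty definition, so they
    can be reviewed by hand rather than silently dropped.
    """
    match = LINE_RE.match(line)
    if not match:
        return []
    headword, rest = match.group(1), match.group(2)
    return [
        (headword, pinyin, text.strip())
        for pinyin, text in READING_RE.findall(rest)
    ]


def to_zdt_row(entry):
    """Join one entry with tabs, the delimiter suggested in this thread."""
    return "\t".join(entry)
```

To convert the whole file, one could read `MostFrequent2200.txt` with `encoding="utf-8"`, call `convert_line` on each line, and write `to_zdt_row(entry)` for every resulting entry to a new UTF-8 file, then choose tab as the element delimiter when importing.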

Alternatively, you can just pull out all the characters and import only those. If those characters are in ZDT's dictionary, it will pull in the pinyin and definition for you. That won't help for characters that have more than one pronunciation, but it would be easier.
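
The characters-only route could look something like the sketch below. It assumes each entry starts with a line number followed by a single headword character, as in the sample lines quoted earlier; `extract_headwords` is a hypothetical helper name, not part of ZDT.

```python
import re

# A line number, whitespace, then the first non-space character (the headword).
HEADWORD_RE = re.compile(r"^\s*\d+\s+(\S)")


def extract_headwords(lines):
    """Collect the headword character from every line that has one.

    Blank lines and lines without a leading number are skipped.
    """
    headwords = []
    for line in lines:
        match = HEADWORD_RE.match(line)
        if match:
            headwords.append(match.group(1))
    return headwords
```

Writing the result one character per line to a UTF-8 text file gives a list that has no fields to delimit at all.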
