Iluvchinese Posted February 20, 2009 at 11:26 PM

Hello guys! It's my first time using ZDT, and it won't let me import this file (see attachment below). The file is UTF-8. gato made a post, gave this file to me, and suggested that I use it to learn Chinese. But ZDT doesn't let me import it. Instead, it says: "Unable to parse file. Check that this file is valid and you're using the correct element delimiter." Please help!

MostFrequent2200.txt
jbradfor Posted February 21, 2009 at 04:16 AM

Just goes to show that you shouldn't trust gato.

This file format is not regular enough for ZDT to be able to import well. Here are the first few lines:

1 的 [de] (grammatical particle) [dì] 目的 mùdì goal [dí] 的确 [dī] cab
2 一 [yī] one; 一定 certain; 一样 same; 一些 some
3 是 [shì] to be
4 不 [bù] not [bú]
5 了 [le] (particle) [liǎo] 了解 comprehend [liào] (=瞭) [liāo] [liáo]
6 人 [rén] person; 人类 rénlèi humankind; 有人吗? anybody here?
7 在 [zài] at; 现在 xiànzài now; 存在 cúnzài exist
8 我 [wǒ] I, me; 我们 wǒmen we
9 有 [yǒu] have; there is; 没有 haven't; 有的 some [yòu] (=又)
10 中 [zhōng] middle; in; 中国 Zhōngguó China [zhòng] hit (a target)
11 这(F這) [zhè] [zhèi] this
12 大 [dà] big; 大家 dàjiā everybody [dài] 大夫 dàifu doctor
13 国(F國) [guó] (国家) country; 中国 China; 美国 USA

There's no delimiter, the format for the traditional form is not regular, and there are multiple pinyin readings per line. If you're good with awk or something like that, you could probably parse it into a format ZDT can understand.
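To make the irregularity concrete, here is a minimal Python sketch that splits one line of the list into its parts. The line shape it assumes (a number, a headword, an optional "(F...)" traditional form, then bracketed pinyin readings each followed by free text) is inferred only from the thirteen sample lines above, and the names parse_line, LINE_RE, and READING_RE are hypothetical. Lines elsewhere in the file may not fit this pattern.

```python
import re

# Assumed shape, based only on the sample lines quoted above:
#   <number> <headword>[(F<traditional>)] [pinyin] text [pinyin] text ...
LINE_RE = re.compile(r"^(\d+)\s+(\S+?)(?:\(F(.+?)\))?\s+(.*)$")

# One bracketed pinyin reading plus whatever text follows it,
# up to the next opening bracket.
READING_RE = re.compile(r"\[([^\]]+)\]\s*([^\[]*)")


def parse_line(line):
    """Split one list line into number, headword, traditional form, readings.

    Returns None when the line does not fit the assumed shape.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    number, headword, traditional, rest = m.groups()
    readings = [(pinyin, text.strip()) for pinyin, text in READING_RE.findall(rest)]
    return {
        "number": int(number),
        "headword": headword,
        "traditional": traditional,  # None when no (F...) form is given
        "readings": readings,        # list of (pinyin, trailing text) pairs
    }


if __name__ == "__main__":
    print(parse_line("11 这(F這) [zhè] [zhèi] this"))
```

Note that a reading can come back with empty trailing text, as "zhè" does for line 11, which is one reason a straight delimiter-based import cannot work on this file.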
Iluvchinese (Author) Posted February 21, 2009 at 02:47 PM

And how do I change it into ZDT format? Please help!
jbradfor Posted February 23, 2009 at 05:21 PM

It's going to be a lot of work. Unless you really want to use this file, you're probably better off looking for other word lists. ZDT's website has some, and the HSK word lists are posted on this forum. Or ask gato if he has the data in another version.

If you really want to use this file, you probably want to write a program that converts each line into a line that ZDT can understand. I recommend three fields per line, separated by tabs: Chinese character, pinyin, definition.

So you'll need to strip off the line number, remove the traditional form (if present), and, for the lines that have more than one pinyin, break each into several lines. Also, if you want to make the "example" words separate entries, that's more work.

E.g., the first five lines could be converted into the following. [Where ??? indicates missing pinyin; not sure how to get that.] [And there should be tabs between the fields, not spaces.]

的 de (grammatical particle)
目的 mùdì goal
的确 dī??? cab
一 yī one
一定 ??? certain
一样 ??? same
一些 ??? some
是 shì to be
不 bù not
了 le (particle)
了解 liǎo??? comprehend

Alternatively, you can just pull out all the characters and import only those. If those characters are in ZDT's dictionary, it will pull in the pinyin and definition for you. That won't help for characters that have more than one pronunciation, but it would be easier.
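The conversion described in this post could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the input line shape is inferred from the sample lines quoted earlier in the thread, the tab-separated character/pinyin/definition layout is the one recommended above rather than a confirmed ZDT specification, and the names to_rows, headwords, and convert are hypothetical. It writes one row per bracketed reading and leaves example words inside the definition field rather than splitting them out, since their pinyin is often missing.

```python
import re

# Assumed shape: <number> <headword>[(F<traditional>)] [pinyin] text ...
# The line number and the optional traditional form are matched but dropped.
LINE_RE = re.compile(r"^\d+\s+(\S+?)(?:\(F.+?\))?\s+(.*)$")
READING_RE = re.compile(r"\[([^\]]+)\]\s*([^\[]*)")


def to_rows(line):
    """Return one 'character<TAB>pinyin<TAB>definition' row per reading."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return []  # line does not fit the assumed shape; skip it
    headword, rest = m.groups()
    return [
        "\t".join([headword, pinyin, text.strip()])
        for pinyin, text in READING_RE.findall(rest)
    ]


def headwords(lines):
    """Return only the headword characters, for a character-only import."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            found.append(m.group(1))
    return found


def convert(src_path, dst_path):
    """Write the tab-separated rows for every parsable line of src_path."""
    # utf-8-sig also reads plain UTF-8 and drops a byte-order mark if present.
    with open(src_path, encoding="utf-8-sig") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            for row in to_rows(line):
                dst.write(row + "\n")
```

A call such as convert("MostFrequent2200.txt", "converted.txt") would produce the tab-separated file (the output name is made up). Readings with no text after them come out with an empty definition field, so the result would still need checking by hand before import.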