drahnier Posted April 5, 2008 at 12:38 PM
After more than one frustrating hour with zdt's import function I'm giving up! I simply don't understand what zdt is complaining about. The file I'd like to import looks simple enough: UTF-8 encoded, with entries such as
安静 ān jìng quiet; peaceful; calm
安排 ān pái to arrange; to plan; to set up
爸爸 bà ba (informal) father
to list the first three. I attached the complete file (zipped) to this post. According to zdt, the general syntax of items in the file should be S TAB P TAB D (see attached image #1). On running the import I get a list of parsing errors (see attached image #2). The imported list itself contains a lot of false entries (see attached image #3). Something really screws up the parsing, but I can't figure out what it is. Any help would be greatly appreciated.
Attachment: l1w.zip
Quest Posted April 5, 2008 at 02:36 PM
Looking at the errors (including 北边), it seems they all contain e3. Perhaps that wasn't the correct e3 character to use?
drahnier Posted April 6, 2008 at 05:56 AM (Author)
Quest: You've got sharp eyes! And yes, after changing all ĕ to ě I could successfully import this file. Thank you very much!
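For anyone hitting the same problem: the fix above swaps the breve vowel ĕ (U+0115) for the caron vowel ě (U+011B), which is the character used for the pinyin third tone. Below is a minimal sketch of doing that replacement for all five vowels, assuming Python 3. The function name `fix_third_tone` is made up for illustration and is not part of zdt.

```python
import unicodedata

# Map breve vowels (not valid pinyin tone marks) to the caron vowels
# used for the third tone. Escapes are used so the code points are unambiguous.
BREVE_TO_CARON = str.maketrans({
    "\u0103": "\u01ce",  # a with breve -> a with caron
    "\u0115": "\u011b",  # e with breve -> e with caron
    "\u012d": "\u01d0",  # i with breve -> i with caron
    "\u014f": "\u01d2",  # o with breve -> o with caron
    "\u016d": "\u01d4",  # u with breve -> u with caron
    "\u0102": "\u01cd",  # uppercase variants
    "\u0114": "\u011a",
    "\u012c": "\u01cf",
    "\u014e": "\u01d1",
    "\u016c": "\u01d3",
})


def fix_third_tone(text: str) -> str:
    """Return text with breve vowels replaced by caron vowels."""
    # NFC first, so a plain vowel followed by a combining breve (U+0306)
    # becomes a single precomposed character that the table can match.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(BREVE_TO_CARON)
```

To clean a whole file, read it with `encoding="utf-8"`, pass the contents through `fix_third_tone`, and write the result to a new file so the original stays untouched.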
BaiYuehan Posted April 14, 2008 at 12:27 PM
I'm having the same trials and tribulations with importing data. I'm on version 0.7.0 b3, but it still won't read in my file (in S P D format, tab delimited). It's failing on at least 50% of lines. Here's a small sample:
Line 8: 我们 tā men they
Unable to find in current dictionary.
Line 13: 他是美国人吗? Tā shì měi guó rén ma? Is he American?
Unable to parse line.
I think what I need is an "Ignore all errors on the input" check box.
Luobot Posted April 14, 2008 at 04:20 PM
"Line 8: 我们 tā men they
Unable to find in current dictionary.
Line 13: 他是美国人吗? Tā shì měi guó rén ma? Is he American?
Unable to parse line."
Line 8 -> 我们 is wo3 men5, not ta1 men5
Line 13 -> Try ZDT's annotator for phrases and sentences.
"I think what I need is an "Ignore all errors on the input" check box"
I also think this could be useful, but I would want to see the entry flagged so that I could research and correct it, if necessary. There could be a global option to show or hide the flag.
BaiYuehan Posted April 14, 2008 at 10:49 PM
Thanks Luobot - I hadn't even thought it might be checking the characters against the pinyin definitions... so that's why it runs so slowly! This is a little too thorough for me. I've got thousands of lines of data, much of it in sentences. I agree it would be ideal (for me at least) if discrepancies were just flagged rather than failing to import.
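Until the importer can flag problem lines itself, one workaround is to pre-check the file outside zdt. The sketch below, assuming Python 3, only verifies the S TAB P TAB D shape (three non-empty tab-separated fields) and collects the lines that fail instead of stopping. It does not check pinyin against a dictionary the way zdt does, and `check_lines` is a made-up name, not a zdt function.

```python
def check_lines(lines):
    """Split lines into (good, flagged).

    good    -- lines with exactly three non-empty tab-separated fields
    flagged -- tuples of (line_number, line, reason) for everything else
    """
    good, flagged = [], []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # blank lines are skipped, not flagged
        fields = line.split("\t")
        if len(fields) != 3:
            reason = f"expected 3 tab-separated fields, found {len(fields)}"
            flagged.append((number, line, reason))
        elif not all(field.strip() for field in fields):
            flagged.append((number, line, "empty field"))
        else:
            good.append(line)
    return good, flagged
```

A typical use would be to open the word list with `encoding="utf-8"`, pass the file object to `check_lines`, write the good lines to a new file for import, and print the flagged ones for later review.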