Insectosaurus Posted June 24, 2021 at 05:47 PM

Hi everyone,

I currently have an Anki deck where I practice writing characters. I use the Taiwan Ministry of Education stroke order images found here. At the moment I add them manually, one by one, by searching for each character. However, there is a downloadable archive file with all the images in it. The problem is that I have no way of matching these images with the characters. If I had a list with one column for the character and one for the image name, it could all be imported in one go. Does anyone know if such a list is possible? Possibly someone with a bit more tech competence than I possess. Perhaps the files are even named more logically than I think? Here is a link for the image archive, and this is what it looks like unarchived.
roddy Posted June 24, 2021 at 06:06 PM

That looks like an encoding issue on the filenames, but the download's painfully slow so I can't really take a look.
Insectosaurus Posted June 24, 2021 at 06:09 PM

2 minutes ago, roddy said: "but the download's painfully slow so I can't really take a look"

Tell me about it. Yesterday I got thrown off the site because it had reached its maximum number of users, 15(!).
roddy Posted June 24, 2021 at 06:14 PM

You could try uploading a few images (or the whole zip, for that matter, we'd be doing their server a favour).
Insectosaurus Posted June 24, 2021 at 06:16 PM

2 minutes ago, roddy said: "You could try uploading a few images (or the whole zip, for that matter, we'd be doing their server a favour)."

This should work.
roddy Posted June 24, 2021 at 06:30 PM

We will now wait for my poor old laptop to extract 6,000 images...

Ok, that seems properly mangled to me. When extracting, it complained of having 1000+ files with the same name, which ain't good. Someone might be able to save 'em, but not me.

And oh my God, this website. YOU CAN SEE THE BACKGROUND IMAGE LOADING! If you're going to have a chirpy animation of a girl playing the trumpet while riding a unicycle, make it load faster.

Good luck!
Insectosaurus Posted June 24, 2021 at 06:38 PM

4 minutes ago, roddy said: "Good luck!"

I'm thinking that if no one here knows how to solve it, it might be worth it for me to just send them an e-mail. Surely they have the correct files somewhere.

5 minutes ago, roddy said: "If you're going to have a chirpy animation of a girl playing the trumpet, while riding a unicycle, make it load faster."

Yeah, I don't know how many hours I've spent over the last few days searching for and saving images. I first used https://strokeorder.com.tw/ but noticed that a lot of the stroke orders differed quite a lot.
NinKenDo Posted June 24, 2021 at 11:57 PM

That looks like Big5 encoding to me, which makes sense since it's from Taiwan. The reason it looks like Big5 is that Big5 is a double-byte encoding, whereas many encodings for Western languages are single-byte. It looks like you're on Windows, which generally assigns a specific text encoding to the filesystem, so if it has assigned you a single-byte encoding (highly likely if it's a Western encoding that isn't Unicode), then your filesystem is seeing the two bytes of each character as two separate characters. I'm thinking the overlapping filenames might be due to Windows' backend use of UTF-16, which might merge characters? Dunno.

I'm on Linux, which uses UTF-8 and has encoding-agnostic filesystems, so I will try to see if there's a way to extract the files with UTF-8 filenames, which should then be OK for you to use (because Windows' UTF-16 backend won't mangle them).

EDIT: Welp, unless I did something wrong in convmv, these are not Big5... Guess now I gotta slog through converting them over and over until we hit a match...

EDIT: Wow, OK, this is proving really difficult to crack. It's not helping that the symbols in the filenames need to be prevented from expanding; otherwise commands at various levels of the script I'm trying to write try to use them as command arguments. I don't think this is going to be easily solvable in bash because of this. I will take another stab at it with Python or maybe Perl, although I'm much less familiar with Perl. But for now, I have other things that need doing.
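To illustrate the kind of mangling described in this post, here is a minimal Python sketch. It assumes the names are Big5 bytes being displayed through a single-byte code page. cp437 is used only as one example of such a code page, and the sample character 永 is an arbitrary choice, not a name taken from the archive.

```python
# Illustration only: Big5 bytes read through a single-byte code page.
sample = "永"  # arbitrary example character, not taken from the archive

raw = sample.encode("big5")      # Big5 stores this character in two bytes
mangled = raw.decode("cp437")    # a single-byte code page shows two symbols
restored = mangled.encode("cp437").decode("big5")  # undo the misreading

print(len(raw), repr(mangled), restored)
```

The round trip only works when the wrong code page maps every byte to some character, as cp437 does. If a different code page did the mangling, the same steps would need that code page's name instead.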
edelweis Posted June 25, 2021 at 06:12 AM

This script works for me: big5unzip.pl
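The contents of big5unzip.pl are not shown in the thread, so the following is not that script. It is a rough Python sketch of one way to do the same kind of job, assuming the archive stores Big5 filenames without the ZIP UTF-8 flag. Python's zipfile module decodes such names as cp437, so the sketch re-encodes them to recover the raw bytes and then decodes those as Big5. The function name extract_big5_zip and its parameters are made up for this example.

```python
import zipfile
from pathlib import Path

UTF8_FLAG = 0x800  # general-purpose flag bit: the name is stored as UTF-8


def extract_big5_zip(zip_path, dest_dir, encoding="big5"):
    """Extract zip_path into dest_dir, re-decoding legacy names as `encoding`.

    Hypothetical helper for illustration; returns the list of files written.
    """
    dest = Path(dest_dir)
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.flag_bits & UTF8_FLAG:
                name = info.filename  # already decoded correctly
            else:
                # orig_filename is the cp437-decoded name before zipfile
                # normalises path separators, so the raw bytes survive.
                name = info.orig_filename.encode("cp437").decode(encoding)
            rel = Path(name)
            if rel.is_absolute() or ".." in rel.parts:
                continue  # skip entries that would escape dest_dir
            target = dest / rel
            if name.endswith("/"):
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(info))
            written.append(target)
    return written
```

Under those assumptions, a call such as extract_big5_zip("images.zip", "out") would write the files with readable names (both paths here are placeholders). If the names turn out not to be Big5, the decode step will either raise an error or produce the wrong characters.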
Insectosaurus Posted June 25, 2021 at 09:34 AM

5 hours ago, edelweis said: "this script works for me"

Fantastic, thank you! Now, how do I use a .pl script? Or perhaps you could upload the fixed files?
edelweis Posted June 25, 2021 at 05:54 PM

I can upload, but I need @roddy or @imron to give me more attachment space. It's a 58MB file.
roddy Posted June 25, 2021 at 07:08 PM

That's odd, as far as I can tell it shouldn't be an issue. Wonder if it's a pho setting. Anyway, as you can tell from my spelling of php, I'm on mobile and can't do anything. Will take a look when I can. Or email it to admin@ and I'll upload it directly.
edelweis Posted June 26, 2021 at 10:58 AM

part2.zip part1.zip part3.zip
edelweis Posted June 26, 2021 at 11:00 AM

Apparently my attachments have a limit of 19.53MB, so I uploaded the files above as 3 zips.
Insectosaurus Posted June 26, 2021 at 11:37 AM

36 minutes ago, edelweis said: "apparently my attachments have a limit of 19.53MB so uploaded above in 3 zips"

Thanks a million!
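With readable filenames, the two-column list asked for in the first post could be generated along these lines. This is only a sketch under a big assumption: that each repaired image is named after its character (for example 永.png), which the thread does not confirm. The function names, the extension list, and the choice to put an img tag in the second column are all made up for illustration. Anki can import tab-separated text files, and the images themselves would still need to be copied into the collection's media folder.

```python
from pathlib import Path


def build_anki_rows(image_dir, extensions=(".png", ".gif", ".jpg")):
    """Return (character, image tag) pairs, one per image file.

    ASSUMPTION: the filename without its extension is the character.
    """
    rows = []
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() not in extensions:
            continue  # ignore anything that is not an image
        rows.append((path.stem, f'<img src="{path.name}">'))
    return rows


def write_tsv(rows, out_path):
    """Write the rows as tab-separated UTF-8 text, one note per line."""
    with open(out_path, "w", encoding="utf-8", newline="\n") as fh:
        for character, image_tag in rows:
            fh.write(f"{character}\t{image_tag}\n")
```

If the filenames follow some other scheme, such as a code number per character, the first column would need a lookup step instead of path.stem.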