NinKenDo Posted January 14, 2019 at 04:56 AM
Innovative Language Learning (the company behind the *Pod101/*Class101 series of resources) made the vocabulary audio from their JapanesePod101 product available through a predictable URL format (e.g., http://assets.languagepod101.com/dictionary/japanese/audiomp3.php?kana=[KANA TRANSCRIPTION HERE]&kanji=[CHINESE CHARACTERS HERE]). I went to the ChineseClass101 site and, using the browser inspector, was able to extract the mp3 for whatever audio I played in the vocabulary tools. However, the URLs all appeared to be random numbers. I'm wondering whether a similarly predictable format exists for ChineseClass101, or whether the numbered URLs are all there is, since the JapanesePod101 recordings have been such a valuable resource to me and others.
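To make the JapanesePod101 URL format above concrete, here is a minimal Python sketch that fills in the two query parameters. The base URL and the parameter names `kana` and `kanji` come from the post; the function name and the example word are only illustrative, and nothing here checks whether the server still answers such requests.

```python
from urllib.parse import urlencode

# Base URL as quoted in the post above.
BASE_URL = "http://assets.languagepod101.com/dictionary/japanese/audiomp3.php"


def japanesepod_audio_url(kana: str, kanji: str) -> str:
    """Build the audio URL for one word from its kana and kanji spellings.

    The function name is hypothetical; only the URL shape is from the post.
    """
    # urlencode percent-encodes the UTF-8 bytes of each value,
    # so non-ASCII text is safe to place in the query string.
    query = urlencode({"kana": kana, "kanji": kanji})
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    # Illustrative word: "neko" (cat).
    print(japanesepod_audio_url("ねこ", "猫"))
```

Pasting the printed URL into a browser is the "typing in the correctly formatted URL" approach described in this thread; no equivalent word-based format is known for ChineseClass101.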
markhavemann Posted January 14, 2019 at 09:54 AM
Maybe you can provide a URL for an example page? It sounds like you could possibly solve that with a Greasemonkey/Tampermonkey script if you were just looking for an easy way to download individual files automatically. If you are looking for a database of Chinese pronunciations, there is one floating around the internet with about 8000 words. I think they give it away for you to do whatever you want with. If that's what you are looking for, I can see if I can find the URL again, or if you PM me I can send you the files, which I should still have on my computer somewhere.
NinKenDo Posted January 14, 2019 at 02:36 PM
Quote: "Maybe you can provide a URL for an example page?"
I'm not 100% sure what you mean by this. Could you paraphrase? As to a *monkey script, I'd really rather have something that didn't need me to sign in and look things up in their dictionary at all. You can access the JapanesePod101 files just by typing in the correctly formatted URL. Obviously the actual URL is some numbered mp3 on an AWS instance like this, but the URL format I showed before gets you to the file you want. I'll PM you about that resource you mentioned, though.
markhavemann Posted January 14, 2019 at 02:40 PM
NinKenDo said: "I'm not 100% sure what you mean by this. Could you paraphrase?"
Reading that back, I see how poorly written it was. I meant a URL to one of the dictionary pages that links to some of the audio files.
markhavemann Posted January 14, 2019 at 02:44 PM
For anyone else who is interested, I have found the link to this: Free audio collection of Chinese words (cmn-caen-tan) http://download.shtooka.net/cmn-caen-tan_flac.tar
NinKenDo Posted January 14, 2019 at 02:45 PM
Oh. It seems you can't link to specific queries, or at least the URL bar isn't automatically updated with a URL that gets you back to your current query. But you can go to https://www.chineseclass101.com/chinese-dictionary/ and search for any reasonably common piece of vocabulary, then hit the play button next to an entry.
markhavemann Posted January 14, 2019 at 03:06 PM
Yeah, it seems to be more or less random. The files are all numbered, but I went to a few numbers in between some of the Chinese ones and they were in other languages, so it's not even predictable in that way. It's strange that they store everything in numbered files but allow you to get the audio with a string in the URL. Maybe it's part of an API for their phone app or something (where the app would want to get audio for words but doesn't have the file name available). The site says there is a desktop app too (though I can't find it), so that might reveal something if you had it and watched your network traffic. It's a real pity, because they seem to have A LOT of audio there, probably much more than that pack at shtooka.net.
VocabSplitter Posted January 16, 2019 at 09:33 PM
Because you mentioned using the inspector to get the URL of the mp3, I assume that you know some programming. I think the best way to download mp3 files from random URLs is to write a small program with Selenium. Selenium can automate operations in the browser (for example, "go to the next page until the last page is reached"), and it can also extract any URLs from a page. Once we have those URLs, we can download the files easily.
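A rough sketch of the Selenium idea above, in Python. It assumes the third-party `selenium` package and a Firefox driver are installed, and it assumes the page exposes the mp3 through the `src` attribute of `<audio>` or `<source>` elements; that is a guess about the ChineseClass101 markup, not something confirmed in this thread. The helper names (`is_mp3`, `collect_mp3_urls`, `download_all`, `run`) and the `audio` output folder are made up for illustration.

```python
import os
import urllib.request
from urllib.parse import urlparse

# Dictionary page mentioned earlier in the thread.
DICTIONARY_URL = "https://www.chineseclass101.com/chinese-dictionary/"


def is_mp3(url):
    """Return True if the URL path ends in .mp3 (any query string is ignored)."""
    return bool(url) and urlparse(url).path.lower().endswith(".mp3")


def collect_mp3_urls(driver):
    """Return the distinct mp3 URLs found on <audio>/<source> tags of the current page."""
    # Imported here so the pure helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By

    # ASSUMPTION: the audio is present in the DOM as an element with a src attribute.
    elements = driver.find_elements(By.CSS_SELECTOR, "audio, source")
    sources = (element.get_attribute("src") for element in elements)
    return sorted({src for src in sources if is_mp3(src)})


def download_all(urls, dest_dir="audio"):
    """Save each URL into dest_dir under its original (numbered) file name."""
    os.makedirs(dest_dir, exist_ok=True)
    for url in urls:
        filename = os.path.basename(urlparse(url).path)
        urllib.request.urlretrieve(url, os.path.join(dest_dir, filename))


def run():
    """Open the dictionary, wait for a manual search, then download what was found."""
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(DICTIONARY_URL)
        # The search (and any sign-in) is done by hand in the opened browser window.
        input("Search for a word and play its audio, then press Enter... ")
        download_all(collect_mp3_urls(driver))
    finally:
        driver.quit()
```

Calling `run()` starts the browser session. The search step is left manual here because the thread does not describe the site's form fields; automating it would need selectors taken from the actual page.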
NinKenDo Posted January 26, 2020 at 05:51 AM
In the last few months I've been playing with Selenium quite a lot, and have actually developed a few tools to aid in gathering Chinese materials. I won't post those here, as they automate the process of extracting material from some smaller-time projects that I think deserve people's financial support. But coming back to this thread, I think a Selenium-based extractor is probably the way to go. If I come up with something, I will post it here.
luc9999 Posted March 5, 2020 at 02:24 AM
@NinKenDo did you manage to come up with anything in the end?
NinKenDo Posted March 19, 2020 at 09:51 AM
Not yet, but I just haven't had the time lately. I haven't given up on the prospect.