Chinese-Forums

Does anybody know if there's a reliable URL format for accessing the ChineseClass101 Vocabulary recordings?


Posted

Innovative Language Learning (the company behind the *Pod/Class101 series of resources) made the vocabulary audio from their JapanesePod101 product available through a reliable URL format, e.g. http://assets.languagepod101.com/dictionary/japanese/audiomp3.php?kana=[KANA TRANSCRIPTION HERE]&kanji=[CHINESE CHARACTERS HERE].
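
To illustrate the format, here is a minimal Python sketch that builds such a URL from a word's kana and kanji. The function name `build_audio_url` and the sample word are only for illustration, and it assumes the server accepts percent-encoded UTF-8 in the query string, which I have not verified beyond the format quoted above.

```python
from urllib.parse import urlencode

# Base address taken from the URL format quoted above.
BASE_URL = "http://assets.languagepod101.com/dictionary/japanese/audiomp3.php"


def build_audio_url(kana: str, kanji: str) -> str:
    """Return the audio URL for a word, given its kana reading and kanji spelling.

    Hypothetical helper: urlencode percent-encodes the non-ASCII
    characters as UTF-8 (assumed to be what the server expects).
    """
    query = urlencode({"kana": kana, "kanji": kanji})
    return f"{BASE_URL}?{query}"


# Example word chosen for illustration only.
print(build_audio_url("ねこ", "猫"))
```

The appeal of this format is that the only inputs are the word itself, so no sign-in or dictionary lookup is needed to work out the address.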

 

I went into the ChineseClass101 site and, using the browser inspector, was able to extract the mp3 for whatever audio I played in the vocabulary tools. However, the URLs all seemed to be random numbers.

 

Since the JapanesePod101 recordings have been such a valuable resource to me and others, I'm wondering whether there is a similarly reliable format for ChineseClass101, or whether the numbered URLs are all that's available.

Posted

Maybe you can provide a URL for an example page? 

 

Sounds like you could possibly solve that with a Greasemonkey/Tampermonkey script if you were just looking for an easy way to download individual files automatically.

 

If you are looking for a database of Chinese pronunciations, there is one floating around the internet with about 8000 words. I think it's given away free to use however you like. If that's what you are looking for, I can see if I can find the URL again, or if you PM me I can send you the files, which I should still have on my computer somewhere.

Posted
Quote

Maybe you can provide a URL for an example page? 

I'm not 100% sure what you mean by this. Could you paraphrase?

 

As to a *monkey script, I'd really rather have something that didn't need me to sign in and look stuff up in their dictionary at all. You can access the JapanesePod101 files just by typing in the correctly formatted URL. Obviously the actual URL is some numbered mp3 on an AWS instance like this, but the URL format I showed before gets you to the file you want.

 

I'll PM you about that resource you mentioned though.

Posted
1 minute ago, NinKenDo said:

I'm not 100% sure what you mean by this. Could you paraphrase?

Reading that back, I see how poorly written it was. I mean a URL to one of the dictionary pages that links to some of the audio files.

Posted

Ya, it seems to be more or less random. It's all numbered, but I went to a few in between some of the Chinese ones and they were in other languages, so it's not even predictable in that way.

 

It's strange that they store everything in numbered files but allow you to get the audio with a string in the URL. Maybe it's part of an API for their phone app or something (where it would want to get audio for words but doesn't have the file name available). 

 

It says there is a desktop app too (though I can't find it) so that might reveal something if you had it and watched your network traffic. 

 

It's a real pity because they seem to have A LOT of audio there, probably much more than that pack at shtooka.net.

Posted

Because you mentioned using the inspector to get the URL of the mp3, I assume that you know programming. I think the best way to download the mp3 files from random URLs is to create a small program with Selenium. Selenium can automate operations in the browser (for example, "go to the next page until it's the last page") and it can also extract whatever URLs you need from a page. After getting those URLs, we can download them easily.
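
A rough sketch of the URL extraction step, under some assumptions: it uses only the Python standard library to pull `.mp3` links out of a page's HTML, so it is a stand-in for the Selenium part rather than a Selenium script. With Selenium, the HTML string would come from `driver.page_source` after the page has loaded. The class and function names, the attribute list, and the sample markup with `example.com` addresses are all made up for illustration and are not the site's real structure.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class Mp3LinkCollector(HTMLParser):
    """Collect every attribute value that points at an .mp3 file."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Assumed attribute names; a real page may use others.
            if name not in ("src", "href", "data-src") or not value:
                continue
            # Ignore any query string when checking the extension.
            if value.lower().split("?")[0].endswith(".mp3"):
                url = urljoin(self.base_url, value)  # resolve relative links
                if url not in self.urls:
                    self.urls.append(url)


def extract_mp3_urls(page_source: str, base_url: str) -> list:
    """Return the unique mp3 URLs found in page_source, in document order."""
    collector = Mp3LinkCollector(base_url)
    collector.feed(page_source)
    return collector.urls


# Made-up markup standing in for a vocabulary page.
sample = """
<div><audio src="https://example.com/audio/12345.mp3"></audio>
<a href="/audio/67890.mp3">play</a>
<a href="/about.html">about</a></div>
"""
print(extract_mp3_urls(sample, "https://example.com/vocab/page1"))
```

Each collected URL could then be fetched with an ordinary HTTP download, which is the "download them easily" part.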

  • 1 year later...
Posted

In the last few months I've been playing with Selenium quite a lot, and actually developed a few tools to aid in gathering Chinese materials.

 

I won't post those here, as they automate the process of extracting material from some smaller-time projects that I think deserve people's financial support.

 

But coming back to this thread, I think a Selenium-based extractor is probably the way to go. If I come up with something, I will post it here.
