simc Posted April 22, 2015 at 08:32 AM

A few weekends ago I wrote a piece of software to help with listening to audiobooks. At the moment it is in a state that works for me, but I'm wondering whether other people would be interested. If they are, I may invest some time in polishing and releasing it.

Basically, a lot of my Chinese study nowadays is listening to audiobooks and trying to figure out what I can't understand using the text and the Pleco reader. First, I would take the text and split it into files so that it was aligned with the episodes of the audiobook. Then, while commuting to and from work by bus, I would listen with an MP3 player in one hand and read along in Pleco with my phone in the other.

The problem was that each audiobook episode is about 25 minutes long, and if you stop paying attention for a while you have no idea where you are in the text.

The solution is to split the audiobook into smaller chunks and mark in the text where each chunk begins and ends. That way, you don't have to read along with the audio to know where you are: you know which fragment you are up to, and it is easy to find your place within a small fragment.

Wouldn't that be quite labour intensive, though? That's the purpose of the software I've written. The task is still manual, but the software makes it much faster. A fragment of text is displayed on the screen and you hit the Enter key when the voice reaches the end of it. Based on that, the MP3 file is split up and the corresponding points in the text are marked. To save time, the audio can be fast-forwarded to approximately where the fragment should end, estimated from the length of the fragment of text. Breaking a 25-minute episode into fragments of about 40 seconds takes perhaps 5-10 minutes of work.

Here is a demonstration I've prepared. It is the first 10 minutes of the 5th track of the audiobook of 平凡的世界.

https://www.dropbox.com/s/3i2k13gl0h042xn/demo.zip?dl=0

So, what do you think?
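The fast-forward idea from the post above can be sketched roughly as follows. This is only an illustration, not the code of the tool itself: it assumes the narrator reads at a constant rate, so the end of each fragment is estimated from its share of the total character count. The names `estimate_end_times` and `seek_target`, and the 5-second lead-in, are hypothetical.

```python
def estimate_end_times(fragments, total_seconds):
    """Estimate the time (in seconds) at which each text fragment ends.

    Assumption (not taken from the original tool): reading speed is
    constant, so elapsed time is proportional to characters read so far.
    """
    total_chars = sum(len(fragment) for fragment in fragments)
    if total_chars == 0:
        return []
    end_times = []
    chars_so_far = 0
    for fragment in fragments:
        chars_so_far += len(fragment)
        end_times.append(total_seconds * chars_so_far / total_chars)
    return end_times


def seek_target(estimated_end, lead_in_seconds=5.0, previous_end=0.0):
    """Pick the position to fast-forward to.

    Jump to slightly before the estimated end so the listener still hears
    the last words of the fragment, but never before the previous cut.
    """
    return max(previous_end, estimated_end - lead_in_seconds)
```

The estimate only needs to be close, because the listener still presses Enter at the true end of the fragment; the lead-in gives a margin for the guess being a little late.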
uni419 Posted April 24, 2015 at 04:14 AM

This sounds awesome!! I've been looking for more convenient ways to break up audio from episodes of 锵锵三人行 so that I can use them for sentence-level shadowing/drilling, and I think this could very easily fit the bill. Three quick questions for you:

1. How does the software determine how to segment the "fragment of text displayed on the screen"? Looking at the Dropbox file, it looks like it does it by [enter] breaks, but I'm wondering if it could be reconfigured to go off periods for sentence-level breakups.

2. Does it work on OSX? (Not a personal concern since I've got a Windows VM, but something I'm guessing others will ask.)

3. Would you be willing to release an unpolished version for people who'd be interested in using it as is? (Myself.)

Best,
Uni
simc Posted April 25, 2015 at 02:22 AM Author

> 1. How does the software determine how to segment the "fragment of text displayed on the screen"? Looking at the dropbox file it looks like it does it by [enter] breaks but i'm wondering if it could be reconfigured to go off of periods for sentence level breakups.

Well, the text normally starts off as one line per paragraph. The software has a minimum length you can set; once that length is reached, the text is split at the next sentence end (i.e. the next full stop, exclamation mark or question mark). After this splitting is done, the software can optionally try to join very short fragments into bigger ones.

It is possible to split audio by silence. But I'm not sure how to automatically split the text to match the silences in the audio. It might be possible, but that sounds a bit more like a research topic...

> 2. Does it work on OSX (not a personal concern since I've got a windows VM but something i'm guessing others will ask)

I've written it on Linux and it should work on OSX, but I can't test it as I don't have a Mac. The main program is written in Python and should work on Windows, Mac and Linux. The bit that splits the MP3 is a bash script, so I don't see why it shouldn't work on a Mac. However, it would not work on Windows unless I rewrite it in Python, which I can certainly do if there is interest. The underlying program which splits the file is avconv from libav, which should be available on Macs.

> 3. Would you be willing to release an unpolished version for people who'd be interest in using it as is? (myself)

Certainly. However, you will be expected to be "good with computers" (know how to install software, edit text files, run commands from the command line, etc.) and to expect that it might crash annoyingly at the wrong time.

EDIT: I've gone away for the long weekend (it's Anzac weekend here), but I'll do this next week.
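The segmentation rule described in this reply (accumulate text up to a minimum length, cut at the next sentence end, then optionally merge very short pieces) could look something like the sketch below. It is a reconstruction from the description only, not the actual code of the tool; the function names, the set of punctuation marks and the merging policy are assumptions.

```python
# Sentence-ending punctuation, full-width (Chinese) and ASCII.
# Which marks the real tool recognises is an assumption.
SENTENCE_ENDS = "。！？.!?"


def split_paragraph(paragraph, min_length):
    """Split one paragraph into fragments of at least min_length characters.

    A cut is made at the first sentence end found once the current
    fragment has reached min_length. Any remainder becomes a final
    (possibly short) fragment.
    """
    fragments = []
    start = 0
    for i, ch in enumerate(paragraph):
        if ch in SENTENCE_ENDS and i + 1 - start >= min_length:
            fragments.append(paragraph[start:i + 1])
            start = i + 1
    if start < len(paragraph):
        fragments.append(paragraph[start:])
    return fragments


def join_short(fragments, short_length):
    """Merge fragments shorter than short_length into a neighbour.

    A short fragment absorbs the one after it; a short fragment at the
    very end is merged into the one before it.
    """
    joined = []
    for fragment in fragments:
        if joined and len(joined[-1]) < short_length:
            joined[-1] += fragment
        else:
            joined.append(fragment)
    if len(joined) > 1 and len(joined[-1]) < short_length:
        last = joined.pop()
        joined[-1] += last
    return joined


def split_text(text, min_length, short_length=0):
    """Split text that has one paragraph per line into fragments."""
    result = []
    for paragraph in text.splitlines():
        if paragraph.strip():
            pieces = split_paragraph(paragraph, min_length)
            result.extend(join_short(pieces, short_length))
    return result
```

For example, `split_text(text, min_length=60, short_length=15)` would give fragments of roughly 60 characters or more per cut. One known gap in this sketch: a closing quotation mark that follows a sentence end (as in 。”) is left at the start of the next fragment.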
uni419 Posted April 25, 2015 at 10:17 AM

> Certainly. However, you will be expected to be "good with computers" (know how to install software, edit text files, run commands from the command line etc) and expect it might crash annoyingly at the wrong time.

Well, I don't meet any of the above prereqs, but Google's there and I'm generally pretty persistent.

> EDIT: I've gone away for the long weekend (It's Anzac weekend here, but i'll do this next week).

Hope you enjoy the weekend.
simc Posted May 2, 2015 at 09:34 AM Author

Okay, I've uploaded this thing now. Go here:

https://github.com/simon-g-crosby/audiobook-cutter

Click the Download ZIP button on the right-hand side.

Now, here is where I drop you in the deep end, because I don't have a Mac and so I can't really give you detailed instructions on how to install the things you need. But anyway, here is what you need to have installed:

libav - https://libav.org/
avbin - http://avbin.github.io/AVbin/Home/Home.html
Python - https://www.python.org/
Pyglet - http://www.pyglet.org/
The font WenQuanYi Micro Hei Mono (or you can edit FONT_NAME in the script to set it to a font you already have)

Once you have all this installed, create a directory and unzip the zip file into it. In that directory, place the MP3 file you wish to cut up and rename it to input.mp3. Place the matching text in a file named input.txt. Then, from a command line, run:

python audiobook_cutter.py

Tell me if you get this far! If you do, I will give you further instructions.

Feel free to ask questions and I will help if I can. I feel a bit guilty here because, based on what you have told me, I fear that getting this set up might be a bit out of your depth.
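Before running the script, it may help to check that the pieces listed above are in place. The following is a hypothetical helper, not part of the audiobook-cutter repository, and it assumes Python 3. It assumes the libav command-line tool is installed under the name `avconv`, and it only checks what is easy to check: it does not verify AVbin or the font.

```python
import importlib.util
import os
import shutil

# File names taken from the setup instructions in the post above.
REQUIRED_FILES = ("input.mp3", "input.txt", "audiobook_cutter.py")


def missing_files(directory="."):
    """Return the required files that are not present in directory."""
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(directory, name))]


def missing_prerequisites(directory="."):
    """Return human-readable descriptions of everything that is missing."""
    problems = []
    # Assumption: the libav splitter is on PATH as "avconv".
    if shutil.which("avconv") is None:
        problems.append("avconv (from libav) was not found on PATH")
    if importlib.util.find_spec("pyglet") is None:
        problems.append("the Python package pyglet is not installed")
    for name in missing_files(directory):
        problems.append("file not found: " + name)
    return problems


if __name__ == "__main__":
    found = missing_prerequisites()
    if found:
        for problem in found:
            print("MISSING:", problem)
    else:
        print("Everything checked here looks ready.")
```

Saved in the same directory as the unzipped files and run with `python`, it prints one line per missing item.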