BertR Posted July 20, 2010 at 06:17 PM

Hello everybody,

To help me with my Chinese I've recently written some software that detects pauses in an audio file and lets me increase the length of these pauses, which gives me more time to interpret each sentence in Chinese. Another feature is that every part between these pauses can be repeated a number of times, for example to hear every sentence three times.

I tried it on a number of files and, surprisingly (since I used a pretty basic algorithm and didn't do any filtering), it works pretty well on them. Nevertheless, if there is noise or music in the background of the recording it will probably not detect the pauses (for example, the recordings of Slow Chinese give problems because of the background music).

I was wondering whether anybody is interested in this software. It works for me, but before other people can use it I should probably spend some more time making it a bit more user-friendly. Current limitations:

- It only works on Windows.
- It is a command-line executable, and you have to specify the options as arguments to the program.
- It doesn't support mp3s (the library I use doesn't support them due to licensing issues; see http://en.wikipedia.org/wiki/MP3#Licensing_and_patent_issues in case you're interested). Ogg, for example, can be used as an alternative file format (there are many converters from mp3 to ogg).

Since I already have a more than full-time job, study Chinese and have a girlfriend to keep happy, I probably won't have a lot of time to give support either.
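For readers curious how this kind of pause detection can work, here is a minimal sketch in Python. It is not the tool's actual code (the source and its language were not posted); it only assumes a "pretty basic algorithm" of the sort described: split the audio into short frames, call a frame quiet when its RMS level is below a threshold, and treat a long enough run of quiet frames as a pause. All names and default values (`find_pauses`, `repeat_and_stretch`, `frame_ms`, `threshold`, `min_pause_s`) are made up for illustration, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math
from typing import List, Tuple


def find_pauses(samples: List[float], sample_rate: int,
                frame_ms: float = 20.0, threshold: float = 0.02,
                min_pause_s: float = 0.3) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of quiet runs lasting at least min_pause_s."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    pauses = []
    run_start = None  # start index of the current run of quiet frames
    n = len(samples)
    for start in range(0, n, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms < threshold:
            if run_start is None:
                run_start = start
        else:
            # A loud frame ends the quiet run; keep the run only if it is long enough.
            if run_start is not None and (start - run_start) / sample_rate >= min_pause_s:
                pauses.append((run_start, start))
            run_start = None
    if run_start is not None and (n - run_start) / sample_rate >= min_pause_s:
        pauses.append((run_start, n))
    return pauses


def repeat_and_stretch(samples: List[float], sample_rate: int,
                       pauses: List[Tuple[int, int]],
                       repeats: int = 3, extra_pause_s: float = 1.0) -> List[float]:
    """Play each spoken segment `repeats` times, each followed by its pause plus extra silence."""
    silence = [0.0] * int(sample_rate * extra_pause_s)
    out: List[float] = []
    pos = 0
    for p_start, p_end in pauses:
        segment = samples[pos:p_start]
        pause = samples[p_start:p_end]
        if segment:
            for _ in range(repeats):
                out.extend(segment)
                out.extend(pause)
                out.extend(silence)
        else:
            out.extend(pause)  # silence at the very start: nothing to repeat
        pos = p_end
    tail = samples[pos:]  # speech after the last detected pause
    if tail:
        for _ in range(repeats):
            out.extend(tail)
            out.extend(silence)
    return out
```

A plain energy threshold like this cannot tell silence from quiet speech over music, which fits the problem mentioned above with recordings that have background music: the level never drops below the threshold, so no pause is found.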
muirm Posted July 21, 2010 at 04:57 AM

What is it written in? I'd be interested in seeing the source code just out of curiosity. It sounds like an interesting problem to poke at.
BertR Posted August 20, 2010 at 01:31 PM

OK, I have extended the tool and made it more user-friendly. It can be downloaded from http://test.nescio.info/AudioSegmenterSetup.msi

You can either generate a new audio file in which language segments are repeated and/or pauses are added, or use a more interactive way of working, where the audio is played within the tool and a segment is repeated until you push the Next button. I hope the application more or less speaks for itself.

Normally you should only use the basic options. The advanced options should only be used when the segmentation is not good enough. For me the segmentation works really well for recordings such as those of Graded Chinese Reader, Boya and so on, which are high-quality recordings with good punctuation and not too fast. If there is background noise or people speak really fast, you might want to play with the advanced options (first try a higher threshold, and if that doesn't work, try lowering the frame size).

Consider this a beta release. It is not extensively tested and has the following known issues:

- The software follows the local formatting of numbers, but the default values don't. So if you configured Windows to represent a number as "1,0" and not as "1.0", you should enter "1,0", but the default value in the GUI is "1.0". If you don't change these values, unexpected results can be expected :-)
- Unicode is not supported: the audio file names and the paths can't contain Chinese characters.
- The mp3 file format is not supported (but many others are).
- The software expects all input fields to have reasonable values (hence if you input garbage, crashes can happen).
- You can update certain parameters while the audio is being played in interactive mode, but these changes are not applied immediately. They are not applied to the next segment either, since the data of the next segment is already created while the current one is played; they take effect from the segment after that one.
- The tool doesn't have a nice icon yet.

I tested it on Windows XP (a 32-bit machine) and Windows 7 (a 64-bit machine).

I also want to make something similar for videos in the future, but I don't know whether I'll have the time. Tomorrow I go to Beijing (my first time in China) and will stay there for more than a month. While in China I won't be able to work on this tool, so don't expect a new version in the coming month and a half.
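The number-formatting and garbage-input issues in the list above are the kind of thing a small input check can soften. Below is a rough sketch in Python, not taken from the tool (whose source is not shown here): a hypothetical `parse_setting` helper that accepts either "1,0" or "1.0" and rejects anything that is not a plain number in an expected range. The function name, the range arguments and the error messages are all assumptions for illustration.

```python
import math


def parse_setting(text: str, minimum: float, maximum: float) -> float:
    """Parse a decimal typed with either ',' or '.' as the decimal separator.

    Raises ValueError instead of letting bad input reach the audio code.
    """
    cleaned = text.strip().replace(",", ".")
    # More than one separator (e.g. "1.000,5") is ambiguous, so refuse it.
    if cleaned.count(".") > 1:
        raise ValueError(f"ambiguous number: {text!r}")
    value = float(cleaned)  # raises ValueError for non-numeric text
    if not math.isfinite(value):
        raise ValueError(f"not a finite number: {text!r}")
    if not (minimum <= value <= maximum):
        raise ValueError(f"{value} is outside {minimum}..{maximum}")
    return value
```

With a helper like this, a default of "1.0" and a user-typed "1,0" give the same value regardless of the Windows regional settings, and nonsense input produces an error message that the GUI can show instead of a crash.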