eslang Posted December 8, 2016 at 09:21 AM
I came across the Autosub tool on the d-addicts forum. Autosub is a utility for automatic speech recognition and subtitle generation.
https://github.com/agermanidis/autosub
Install AutoSub Step to Step in Windows with Translate subtitle
https://github.com/agermanidis/autosub/issues/31
So I tried the Autosub tool on the show 锵锵三人行, for which realmayo had posted a link (Transcripts for recent 锵锵三人行 episodes) and which also came up in the "any good TV series recently?" topic thread:
The holy grail for me I think is to find good shows which also have "soft" subtitles (in Chinese) available, that is, a downloadable text file of subtitles.
@realmayo, would you mind sharing with us which gems you have found (the combo mp4/srt files) and where we can download them?
锵锵三人行 doesn't appear to have subtitles at all.
Please refer to the attached file (picture) to take a look at the "soft" subtitles. Overall, I found that the Autosub tool managed to capture some lines correctly in comparison with the YouTube captions feature.
wibr Posted December 8, 2016 at 12:17 PM
The speech-to-text is done by https://cloud.google.com/speech/ and the API key is hardcoded in the script, so I wonder who will pay for the usage. It's only free for up to 60 minutes per month; after that it's $1.50 per hour.
eslang Posted December 8, 2016 at 02:14 PM Author
@wibr - Thanks for the additional information; that's certainly good to know and keep in mind. I'm curious which show you tried it on and what you think of the quality of the "soft subtitles".
So far, I have tried it on one English documentary, two Chinese documentaries, and three Japanese programs (a documentary, a trailer and a drama). Most likely I have not hit the 60 minutes per month limit yet.
[Edit] This software tool was installed two days ago and the following shows were tested:
1 English - Documentary (4min 4sec)
2 Chinese - Documentary (5min 1sec) 锵锵三人行
2 Chinese - Documentary (4min 58sec) 文明之旅
3 Japanese - Documentary (6min 11sec)
3 Japanese - Trailer (1min 39sec)
3 Japanese - Drama (51min 20sec)
Total Time: 73min 13sec
The Japanese documentary has around 56% correct phrases, 30% incorrect phrases, and 14% grey-area phrases. The Japanese drama has around 34% correct phrases, 40% incorrect phrases, and 26% grey-area phrases.
Correct phrases - 正字
Incorrect phrases - 誤字・脱字
Grey-area phrases - 字余り
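As a rough check of the total above against the quota wibr quotes, here is a minimal Python sketch. The clip durations come from the list in the post; the 60 free minutes and $1.50 per hour are the figures quoted in this thread, not verified against Google's price list. The function names are made up for illustration, and the linear proration ignores any billing increments the real service may apply.

```python
# Clip durations as (label, minutes, seconds), copied from the post above.
CLIPS = [
    ("English documentary", 4, 4),
    ("Chinese documentary (锵锵三人行)", 5, 1),
    ("Chinese documentary (文明之旅)", 4, 58),
    ("Japanese documentary", 6, 11),
    ("Japanese trailer", 1, 39),
    ("Japanese drama", 51, 20),
]

# Figures quoted in the thread (assumptions, not checked against current pricing).
FREE_SECONDS = 60 * 60
USD_PER_HOUR = 1.50


def total_seconds(clips):
    """Sum the clip durations in seconds."""
    return sum(minutes * 60 + seconds for _, minutes, seconds in clips)


def overage_cost(seconds, free_seconds=FREE_SECONDS, usd_per_hour=USD_PER_HOUR):
    """Estimate the cost of usage beyond the free quota, prorated linearly."""
    billable = max(0, seconds - free_seconds)
    return billable / 3600 * usd_per_hour


total = total_seconds(CLIPS)
minutes, seconds = divmod(total, 60)
print(f"Total: {minutes}min {seconds}sec")                # Total: 73min 13sec
print(f"Estimated overage: ${overage_cost(total):.2f}")   # Estimated overage: $0.33
```

Under these assumptions the six clips already exceed the 60-minute quota by a little over 13 minutes, which would come to roughly a third of a dollar for whoever owns the key.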
eslang Posted December 10, 2016 at 04:58 AM Author
I am not sure why wibr has the API key problem. In any case, I managed to run the software tool smoothly on 4 episodes (about 50min per episode) of Japanese drama yesterday.
I have edited the "auto-generated subtitles" for the talk show 锵锵三人行 (about the first 5min), using Aegisub to fine-tune and adjust the subtitle timing, then copying and pasting the relevant text into each subtitle line. The transcript can be found at the link below.
中国拍不出好电影 这事能怪小鲜肉吗?_凤凰卫视
http://phtv.ifeng.com/a/20161111/44491382_0.shtml
The soft subtitle (srt file) is attached: 中国拍不出好电影 这事能怪小鲜肉吗?_凤凰卫视_1.srt
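For a uniform timing correction of the kind described above, a small script can do part of what was done by hand in Aegisub. The following is only a sketch, assuming a standard SRT file with `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines; `shift_srt` and `shift_timestamp` are hypothetical helper names, and the sample cue is invented rather than taken from the attached file. It moves every cue by the same offset and does not replace per-line adjustment.

```python
import re

# Matches one SRT timestamp such as 00:01:02,345.
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_ms):
    """Return the matched timestamp moved by offset_ms, clamped at zero."""
    h, m, s, ms = (int(group) for group in match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    total = max(0, total)
    h, rest = divmod(total, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def shift_srt(text, offset_ms):
    """Shift every timing line of an SRT document by offset_ms milliseconds."""
    shifted = []
    for line in text.splitlines():
        # Only timing lines contain the arrow; cue numbers and text pass through.
        if "-->" in line:
            line = TIMESTAMP.sub(lambda mt: shift_timestamp(mt, offset_ms), line)
        shifted.append(line)
    return "\n".join(shifted)


# Invented sample cue, delayed by 750 ms.
sample = "1\n00:00:01,500 --> 00:00:04,000\nexample line"
print(shift_srt(sample, 750))
```

A negative offset moves the subtitles earlier; timestamps that would fall before the start of the video are clamped to 00:00:00,000.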
wibr Posted December 10, 2016 at 08:19 AM
I haven't actually tried the software; I just checked the GitHub page and found that the speech API key is hardcoded in the source files. I am not really familiar with the Google APIs and how they are billed, so maybe I am missing something here, but according to the website only the first 60min are free. So if you manage to go above the 60min, as far as I understand it, the owner of the API key will have to pay for that.
eslang Posted December 15, 2016 at 03:19 AM Author
@wibr - It is likely that the software developer, or the owner of the API key (who has engaged the software developer), will have to pay the amount billed. The software is still relatively unknown, so it is unlikely that end users will fork out money for a beta version or prototype. Metaphorically speaking, potential customers are not likely to pay for gasoline when they visit a showroom to test drive new car models.
If you (or others) happen to test out the software later, it would be great to hear about the quality of the auto-generated subtitles.