Auto transcribing Mandarin audio files


Posted

I've been listening to the lectures on American Mandarin Society podcasts (highly recommended by the way - good for listening skills). 

I was thinking that it might be good to have a transcription to work through, for various reasons. Just wanted to check here before I begin: how accurate would you say auto transcriptions are for longer pieces in Mandarin (40 mins + in this case, but in what I consider a very clear, standard 普通话)? Also, are there any particular methods or software anyone would recommend? Quick and simple would be ideal.

 

I had a quick look on here and the only things of note are a few years old (at least what I could find). I imagine there has been a lot of rapid development so worth another shout. 

 

Thanks. 

Posted

I'm actually playing about with something in this area.  The technology is good, but not great and still has errors.

 

The good news is that a large number of the researchers working on this stuff are Chinese (either native or of Chinese descent), so Mandarin is typically the second-best-supported language after English.

 

Microsoft, Google and Baidu all have Speech to Text APIs that you can use for this, but there may be a cost associated with them, and you'll need basic programming skills to use them.

 

 

Posted

As a quick follow-up, here is what Microsoft and Google produced for the following brief section of the most recent American Mandarin Society podcast.

 

ams-podcast.mp3

 

Microsoft said:

那么我首先我个人的感受包括我在华盛顿这两个多月来跟美国的一些专家的交流我们有一个共识那就是基本上这种战略大三角中美苏的关系已经不存在了现在的中美关系是比较独立的独立的三组这个双边关系为什么刚才那个讲那个中国关系这个没关系但是他们组关系之间有非常紧密的联系特别是有关系可能? 这次来的就是主要关注整体。 可能我个人一个感受啊大家互需要中感受

 

and

 

Google said:

那么我出现我给你来个人那个感受包括在华盛顿这两个多月来跟美国的一些专家的交流我们有一个共识那就是基本上这种战略大三角笼中美速的关系已经不存在了现在的中美俄关系是比较独立的是独立的三组这个双边关系最为什么刚才那个讲就说要蒋美国关系的中国关系这个美关系统的但是他们这三组关系之间的友友非常紧密的联系特别是有一组关系所以可能会对另外两个关系产生根本性的影响所以我觉得这个议题那也是非常值得学研究的这一次来了就主要关注这一题可能而不不吭的一个感受啊大家或许有种感受就是中俄关系在经济研的答案是比较好的

 

This was done just by playing the audio on my speakers and using the demos available on their respective websites.  The Baidu one wouldn't transcribe such a long section of audio, so it's left out here, but ironically it was the worst of the three anyway.
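
On the length problem: a demo or API that refuses a long section will usually accept the same audio in shorter pieces. Here is a minimal sketch of how a 40-minute file could be divided, assuming a 55-second chunk with a 2-second overlap. Both numbers are my assumptions, not any provider's documented limit, and `plan_chunks` is a hypothetical helper name.

```python
def plan_chunks(total_seconds, chunk_seconds=55.0, overlap_seconds=2.0):
    """Return (start, end) times in seconds that cover the whole recording.

    chunk_seconds and overlap_seconds are illustrative defaults (assumptions),
    not limits taken from any particular speech-to-text service.
    """
    if chunk_seconds <= overlap_seconds:
        raise ValueError("chunk_seconds must be larger than overlap_seconds")
    chunks = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        chunks.append((start, end))
        if end >= total_seconds:
            break
        # Step back slightly so a word cut at the boundary is heard whole
        # in the next chunk.
        start = end - overlap_seconds
    return chunks


# A 40-minute lecture, as in the original question.
chunks = plan_chunks(40 * 60)
for start, end in chunks[:3]:
    print(f"{start:7.1f}s to {end:7.1f}s")
```

Each (start, end) pair can then be handed to whatever audio tool you use to cut the file. The overlap means the joined transcript will contain a few repeated words at each boundary, which need trimming by hand.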

 

Posted

This is really interesting - thanks. Definitely developments of note. I think we'll be seeing a lot more of this stuff as the tech develops. 

I'll have a little play around with it. 

 

Cheers. 

Posted

Without having any programming skill, I was able to generate transcripts using Auphonic.

Read this blog post titled "Make podcasts searchable speech to text".

The process was easy to execute and the output was pretty decent.

Of course, the technology isn't perfect yet and, if you'd like the transcription to be 100% accurate, someone will have to listen to the audio from beginning to end while checking the generated transcript for inaccuracies.
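
One way to cut down the checking is to run the audio through two engines and listen most closely where they disagree. Below is a rough sketch using Python's standard `difflib`, with two short excerpts taken from the Microsoft and Google outputs earlier in the thread (`disagreements` is a hypothetical helper name). Agreement is no guarantee of correctness, since both engines can make the same mistake, so treat this as a way to prioritise rather than a replacement for proofreading.

```python
import difflib


def disagreements(a, b):
    """Return (tag, a_piece, b_piece) for each span where two transcripts differ."""
    # autojunk=False turns off the "popular character" heuristic, which can
    # otherwise skew matching on long texts.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Excerpts copied from the two transcripts posted above.
microsoft = "这种战略大三角中美苏的关系已经不存在了"
google = "这种战略大三角笼中美速的关系已经不存在了"

for tag, ms_piece, google_piece in disagreements(microsoft, google):
    print(tag, repr(ms_piece), repr(google_piece))
```

The comparison is character by character, which suits Chinese text because there are no spaces to split words on.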

Posted

Thanks again. I wasn't aware of Auphonic before and will certainly take a look.
