Are there any good AI audio transcription services?

June 25, 2019 at 04:57 PM

I wonder if there are any good AI audio transcription services (audio to text) out there?

For example, I realised that the old transcripts for the Ximalaya podcasts (https://www.ximalaya.com/) are actually AI transcriptions.

So far, I have found (?): https://sonix.ai/languages/transcribe-chinese-mandarin-audio

Any experience?

June 25, 2019 at 05:28 PM

Google and Microsoft both have products that can do this (and Amazon is working on it too), but require programming skills in order to make use of them.

June 25, 2019 at 06:12 PM

43 minutes ago, imron said:

but require programming skills in order to make use of them.

So, I guess I am looking for user-friendly ones for Dummies without programming skills?

June 26, 2019 at 12:13 AM

6 hours ago, Jan Finster said:

So, I guess I am looking for user-friendly ones for Dummies without programming skills?

And I think I can recommend something just like that for you!

Have no fear. I am no programmer but I was able to pull it off after reading the steps outlined in the blog post.

https://auphonic.com/blog/2016/12/02/make-podcasts-searchable-speech-to-text/

June 26, 2019 at 10:57 AM

I've tried Sonix and Google. Sonix was user-friendly, but the pricing seemed quite dear. The Google API was cheaper but requires programming skills. Both still produced a transcript that had most of the words in the audio, but with enough errors to make the text incomprehensible unless you already know what it is supposed to say. What comes out seems to be more of a starting point that humans have to fix up.

It can still be quite useful to have a bad transcript, though: it makes it easier to look up words you don't know in the Pleco clip reader. The program I wrote for the Google API inserts a timecode every 10 seconds so I can find where I am in the transcript based on how far through the audio I am.
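For anyone curious what the timecode idea might look like, here is a minimal sketch in Python. It is not the program described above. It assumes you already have each recognized word paired with its start time in seconds (speech APIs can typically return word-level timestamps when asked), and the names `format_timecode` and `add_timecodes` are made up for illustration.

```python
from typing import Iterable, List, Tuple


def format_timecode(seconds: float) -> str:
    """Render a time offset as [MM:SS], dropping fractions of a second."""
    total = int(seconds)
    return f"[{total // 60:02d}:{total % 60:02d}]"


def add_timecodes(
    words: Iterable[Tuple[str, float]],
    interval: float = 10.0,
    sep: str = "",
) -> str:
    """Group (word, start_seconds) pairs into lines, one per time interval.

    Assumes the words are sorted by start time. `sep` is the string placed
    between words: empty for Chinese, a space for English.
    """
    lines: List[Tuple[float, List[str]]] = []
    next_mark = 0.0
    for word, start in words:
        if not lines or start >= next_mark:
            # Snap the marker to the interval boundary at or before this word.
            mark = int(start // interval) * interval
            lines.append((mark, []))
            next_mark = mark + interval
        lines[-1][1].append(word)
    return "\n".join(
        f"{format_timecode(mark)} {sep.join(chunk)}" for mark, chunk in lines
    )


# Hypothetical recognizer output: (word, start time in seconds).
sample = [("大家", 0.4), ("好", 1.1), ("今天", 9.8), ("我们", 10.2), ("聊", 25.0)]
print(add_timecodes(sample))
# [00:00] 大家好今天
# [00:10] 我们
# [00:20] 聊
```

Intervals with no speech simply produce no line, so a gap in the timecodes tells you there was a pause in the audio.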

June 26, 2019 at 02:55 PM

Anyone tried Xunfei API? They're supposed to be the leader in this field.

June 26, 2019 at 04:37 PM

If you've got access to a reasonably up-to-date iPhone or iPad, Apple's SFSpeechRecognizer API does a decent job with Chinese, works offline, is quite easy to use, and is totally free. (does require coding to use, but somebody may have written a free transcriber app using it by now)

June 27, 2019 at 12:43 AM

8 hours ago, mikelove said:

works offline

Are you sure? From the docs you linked to

Apple said:

Be prepared to handle failures caused by speech recognition limits. Because speech recognition is a network-based service, limits are enforced so that the service can remain freely available to all apps

They mention that some languages require an Internet connection (implying that perhaps some languages don't). Is there a way to tell which ones do or do not?

June 27, 2019 at 01:01 AM

It depends on the device and the language, but I know from firsthand testing that on a newish iPhone it will do Chinese offline.

June 27, 2019 at 01:12 AM

What language do you have your UI set to? I wonder if there's any connection with that.

June 27, 2019 at 02:46 AM

US English, but I don't believe there's a connection. Actually they added an API in iOS 13 to detect whether or not a recognizer (initialized with a specific locale) supports on-device recognition.

April 9, 2020 at 09:40 AM

I recently read about the idea of using a virtual audio cable. If I am not mistaken, this would basically connect your audio source (e.g. YouTube) directly to your listening device (e.g. Google Translate). Has anyone here got the tech skills to set up such a thing?

I found this product online, but the full version costs $49 (?): https://www.vb-audio.com/Cable/

April 9, 2020 at 10:21 AM

On 6/26/2019 at 5:37 PM, mikelove said:

(does require coding to use, but somebody may have written a free transcriber app using it by now)

A quick look on the App Store, and there are at least 3 transcription apps which look to have launched in the last year or so. I've not tried any, but they're presumably worth a look.

April 17, 2020 at 02:48 PM

Today I have tested this automatic transcription service: https://www.happyscribe.co/

Once you register, you get one 30-minute transcription for free.

I uploaded a recording from a medical seminar. The audio quality was OK, but not great. The speaker (= non-professional translator) was from 四川.

Still, the result was surprisingly OK. Since it is an automatic transcription service, there are obvious limitations, e.g. it sometimes used the wrong character, such as 再 instead of 在. So far, my impression is that 90-95% is correct.
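The 90-95% figure above is an impression from reading, not a measurement. If you have a hand-corrected version of even a short stretch of the audio, a rough way to put a number on it is character-level edit distance between the corrected text and the automatic one. The sketch below assumes exactly that setup; `edit_distance` and `char_accuracy` are illustrative names and the example sentence is invented.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn hyp into ref."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # character missing from hyp
                curr[j - 1] + 1,     # extra character in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """1 minus the character error rate, relative to the reference length.

    Can drop below 0 if the automatic transcript contains many extra characters.
    """
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return 1 - edit_distance(ref, hyp) / len(ref)


# Invented example: one 在 was transcribed as 再.
reference = "我现在在医院再检查一次"
automatic = "我现再在医院再检查一次"
print(f"{char_accuracy(reference, automatic):.1%}")  # 90.9%
```

Strip punctuation and whitespace from both texts first if you only care about the characters themselves, since services differ in how they punctuate.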

The transcription took about 10 minutes.

I wonder if a professional human transcription service is much better when it comes to technical texts at that rate (?)

1 hour costs $12, which, to me, is fair, especially since the human transcription services I checked charge $2/min (i.e. $120 per hour).

Jan Finster

imron

Jan Finster

pon00050

simc

Publius

mikelove

imron

mikelove

imron

mikelove

Jan Finster

roddy

Jan Finster
