How to create your own "transcript" from a Chinese video


Posted

Hi All,

I wanted to share a method that I use for extracting the Chinese hard subtitles from AVI, FLV and RMVB videos. Most of the Chinese videos available on Youku, eMule, YouTube et al. have "hard subs," which means that the subtitles are "burned" directly onto the image and therefore cannot be extracted or opened separately. I use an excellent program called AVISubDetector (freeware available here: http://www.videohelp.com/tools/AVISubDetector). You can also download a guide here: http://tinyurl.com/ylcerzv. Here's a brief overview of how to use AVISubDetector to make "transcripts" out of hard-subbed videos (the guide goes into more detail):

1. Load an AVI file with Chinese subtitles into AVISubDetector (if the original file is RMVB or FLV you'll need to convert it to AVI first using a program such as AVS Video Converter).

2. Under "Settings," crop the image so that it just includes the subtitle area at the bottom of the frame.

3. Under "Project" select an output folder.

4. Click "Start (Full)"

The program will automatically detect the subtitles, crop them, and save them as a series of small BMP image files in a subfolder labeled "SubPic." It's that simple. The subtitle BMPs are also automatically numbered. You can easily batch-insert them into a word-processing document or combine them into a PDF file, which basically gives you a "transcript" of the video. You can also drag & drop individual subtitles into Anki to make flashcards (unfortunately, since the BMPs are image files, you cannot paste the text directly into a dictionary program like Wenlin). Personally, I use another free program called PdfLrf to combine the subtitle images into a Sony Reader LRF document, which I then load onto my Sony Reader (a Kindle-type device -- reading computer screens for a long time makes my eyes ache).
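For anyone who would rather script the batch-insert step, here is a minimal sketch in Python (standard library only). It is not part of AVISubDetector. It assumes the BMPs sit in one folder and that their file names contain the sequence number, which is an assumption about the naming rather than something confirmed here. The function names `natural_key` and `build_transcript` are made up for this example. It writes a plain HTML page that shows the images in numeric order, which you can open in a browser or word processor and print to PDF.

```python
import html
import re
from pathlib import Path


def natural_key(path):
    """Sort key that puts '2.bmp' before '10.bmp' (numeric, not alphabetic)."""
    parts = re.split(r"(\d+)", path.name)
    # re.split with a capture group alternates text, digits, text, ...
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def build_transcript(subpic_dir, out_file="transcript.html"):
    """Write an HTML page listing every BMP in subpic_dir in numeric order.

    Returns the path of the page and the number of images included.
    """
    folder = Path(subpic_dir)
    images = sorted(
        (p for p in folder.iterdir() if p.suffix.lower() == ".bmp"),
        key=natural_key,
    )
    rows = [
        f'<p>{i}. <img src="{html.escape(p.name)}" alt="subtitle {i}"></p>'
        for i, p in enumerate(images, start=1)
    ]
    page = "<!DOCTYPE html>\n<html><body>\n" + "\n".join(rows) + "\n</body></html>\n"
    # The page is saved next to the images so the relative src paths resolve.
    out_path = folder / out_file
    out_path.write_text(page, encoding="utf-8")
    return out_path, len(images)
```

Calling `build_transcript("SubPic")` from the project's output folder should leave a `transcript.html` inside the SubPic folder, next to the images.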

Anyway, I hope some of you find this useful. I've attached a sample, which actually came from a fairly low-resolution source (an RMVB of 家有儿女 downloaded from eMule).

2870_thumb.attach

Posted

Thanks for that. I can see that being useful.

Posted

Have you tried running the .bmp files through OCR to get a soft copy? That would be very handy indeed.

Posted

OCR would probably have trouble with this kind of input, but I can imagine that even just re-typing this stuff is easier than having to constantly pause, rewind, and all that other stuff that you need to transcribe an episode using the standard method.

Posted

Also, if you could keep track of the timings you could then use it to create soft subtitle files - either of the original or translations. Does it do that? That would also be handy.
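To make the soft-subtitle idea concrete: if you do end up with a start time, an end time and the typed-out text for each line, turning that into a SubRip (.srt) file is straightforward. Below is a minimal sketch in Python. Where the timings come from is left open, since nothing above confirms how AVISubDetector exports them, and the function names `srt_timestamp` and `to_srt` are hypothetical. Times are assumed to be in milliseconds.

```python
def srt_timestamp(ms):
    """Format a time in milliseconds as the SRT timestamp HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues):
    """Build SRT text from an iterable of (start_ms, end_ms, text) tuples.

    Each cue becomes: a running index, a timing line, the subtitle text,
    with a blank line separating consecutive cues.
    """
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)
```

Save the result with UTF-8 encoding, for example `open("episode.srt", "w", encoding="utf-8").write(to_srt(cues))`, so that the Chinese characters survive.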

Posted

Looks helpful. I'll give it a try. Thanks.

  • 2 weeks later...
Posted

Well I've failed on this right from the start: although I believe I may have loaded an .avi file into the programme, I've got no idea how to get it to show me a frame, so I can mark out the boundaries for the subtitles.

  • 2 weeks later...
Posted

@realmayo

Well I've failed on this right from the start: although I believe I may have loaded an .avi file into the programme, I've got no idea how to get it to show me a frame, so I can mark out the boundaries for the subtitles.

Did you download the guide? Here's the full URL:

http://www.wxfs.org/guides/HowtoRiptheTimingandEnglishSubsFromanAVIfileUsingAviSubDetector.pdf

Especially look at p.5 of the guide where it describes how to use the "Settings" tab, which is where the "Crop Settings" are. You have to slide the Top and Bottom crop settings until they just cover the subtitle region.

  • 5 years later...
Posted

What should I do to make AVISubDetector work properly?

 

719cd60caf91.png

 

I thought it might be the space distance that is too small, so I changed it, still no effect...
