Chinese-Forums

Learning with DVD questions


leosmith


Maybe you have some Chinese keywords to google (or to baidu)?

This thread, which you're already aware of, contains links to the e-book and audio book. I guess the title plus the names of the author and leading actors would be the main keywords to search on...

Also there are old threads here giving plot summaries...

I've attached some reformatted, tab-delimited versions of your files. They should make correction/collaboration easier.

Episode 1 crashes the Python script (CSDproc) I used, for some reason*. Perhaps someone more computer-literate than me can suggest a fix? Also, could other people have a go at using these in the ZDT annotator? It doesn't seem to work on them, which is a real drag...

* Traceback (most recent call last):
  File "C:/Python25/CSDproc1.py", line 7, in <module>
    rawinputline = inputfile.readline()
  File "C:\Python25\lib\codecs.py", line 610, in readline
    return self.reader.readline(size)
UnicodeDecodeError: 'gb2312' codec can't decode bytes in position 2-3: illegal multibyte sequence
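One thing that might be worth trying for an error like this is a more tolerant decode. The sketch below is not the CSDproc script; it is a standalone, hypothetical helper (the name `decode_subtitle_bytes` and the list of candidate encodings are my own assumptions), written for current Python 3 rather than Python 2.5. It assumes the episode 1 file is either UTF-8 or contains characters outside plain GB2312, which I can't confirm without seeing the file. It tries UTF-8 first, then GB18030 (a superset of GB2312), and only as a last resort replaces undecodable bytes.

```python
def decode_subtitle_bytes(data, candidates=("utf-8-sig", "gb18030")):
    """Decode raw subtitle bytes, trying each candidate encoding in turn.

    Returns (text, encoding_used). The candidate list is an assumption,
    not something taken from the original script.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    fallback = candidates[-1]
    return data.decode(fallback, errors="replace"), fallback


if __name__ == "__main__":
    sample = "离婚".encode("gb2312")
    print(decode_subtitle_bytes(sample))
```

To use it on a file, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes in. The returned encoding name shows which guess succeeded.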

divorce-02p.txt

divorce-03p.txt

divorce-04p.txt

divorce-05p.txt

CSDproc.txt


Hi onebir

I used MY subtitle file divorce-02.txt with the video file and produced the picture (see below), with the subtitle in the middle (which needs no correction).

I used YOUR subtitle file with the video file ... and cannot "see" the subtitle anymore :(

That means you've changed the structure of the data.

1129_thumb.attach


"That means you've changed the structure of the data."

Yes, I changed the encoding from GB2312 to Unicode (UTF-8), since I thought more software would be compatible with that, and changed the layout to make it easier to edit manually (as I said).

I suspect either change would be enough to prevent your player from recognising the subs, but you could try the attached GB2312-encoded version, which has the same layout as the files I posted before.

If that's not it, it's the layout. It's not really a problem. We can use the easy-to-read format for editing and, because the easy-edit format preserves the line numbers, I can then write another script to put the edited files back into SRT format (pulling the times out of the original files).
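A rough sketch of what such a script could look like, in Python 3. This is not the actual script; the function names are made up for illustration, and it assumes each line of the edited file has the form "number, TAB, text", which may not match the real layout of the attached files. It also assumes the original SRT uses the usual number line, time line, text lines, blank line structure.

```python
import re

# Matches an SRT time line such as "00:00:01,000 --> 00:00:02,500".
TIME_LINE = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def read_srt_times(srt_text):
    """Map each subtitle number to its time line in the original SRT text."""
    times = {}
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if (len(lines) >= 2 and lines[0].strip().isdigit()
                and TIME_LINE.match(lines[1].strip())):
            times[int(lines[0])] = lines[1].strip()
    return times


def rebuild_srt(original_srt_text, edited_tab_text):
    """Combine edited 'number<TAB>text' lines with the original time lines.

    Raises KeyError if an edited line has a number missing from the original.
    """
    times = read_srt_times(original_srt_text)
    blocks = []
    for line in edited_tab_text.splitlines():
        if not line.strip():
            continue  # skip empty lines in the edited file
        number, _, text = line.partition("\t")
        cue = int(number)
        blocks.append(f"{cue}\n{times[cue]}\n{text}\n")
    return "\n".join(blocks)
```

The result is an ordinary string, so the encoding is chosen when writing it out, for example `open(path, "w", encoding="gb18030")` for a GB-encoded file.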

divorce-02pgb.txt


Hi onebir

UTF is not yet a standard but a headache. :)

The subtitle file is in SubRip format. It is a kind of standard.

Many software video players, such as VideoLAN and Windows Media Player, can use that format for subtitles, with the extension .srt.

I will use it, for example, as a pinyin subtitle .. to train/test my ears .. etc.

In Chinese the context is very, very important! What is context in this sense...!?

Sometimes the OCR software cannot "see" a character line. Sometimes it sees a line that is too long, or sees it double.

It is the time stamp, not the original number of the subtitle line, that is important.
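To illustrate matching on time stamps rather than line numbers, here is a small Python sketch. The function names are hypothetical and it assumes time stamps in the usual SubRip form `HH:MM:SS,mmm`. It converts a time line to milliseconds and picks the cue whose start time is closest to a given moment, so a missed or doubled OCR line does not shift everything after it.

```python
import re

# One SubRip time stamp: hours, minutes, seconds, milliseconds.
STAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_milliseconds(stamp):
    """Convert 'HH:MM:SS,mmm' to an integer number of milliseconds."""
    match = STAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not a SubRip time stamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def start_time_ms(time_line):
    """Return the start time of a line like '00:01:02,500 --> 00:01:05,000'."""
    start, _, _end = time_line.partition("-->")
    return to_milliseconds(start)


def nearest_cue(target_ms, start_times_ms):
    """Return the index of the cue whose start time is closest to target_ms."""
    return min(range(len(start_times_ms)),
               key=lambda i: abs(start_times_ms[i] - target_ms))
```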

I understand that for you it is easy to synchronize the character-line with the video afterwards? Well ... optimist! :)


"What is the file divorce-02pgb.txt for?"

It's a GB2312-encoded version of divorce-02p.txt ;-) to check whether just changing the encoding back would work.

Since it doesn't, I'll write a script to convert the tab-delimited format back to (GB-encoded) SRT format...


Just a note:

I bought "Chinese Style Divorce" as a 3-DVD set in the large Wangfujing bookshop for RMB 48.

I haven't looked at it yet, but my friends told me afterwards that it's not funny. I'm starting to get some buyer's remorse...

Luckily I also bought 好想好想 - 谈恋爱. It's a 4-DVD set. I paid 90 in a small shop, but saw it later in WFJ for 58, well.... Anyway, as I had checked many shops, I knew it was a hard-to-find DVD and I took no chances.

This will keep me busy till 2012.

PS: I know you can find the rips on the web, but mostly only in RMVB format, which my DVD player can't play, and I still like to use the TV... (other than that, the quality is often gruesome)

