Chinese-Forums

How I study 欢乐颂 scripts using the Anki cards automatically generated by Chinese Text Analyser using Lua script


pon00050


I recently finished watching all episodes of 欢乐颂 and decided that there is a lot of good vocabulary I can learn from this drama.

So I started looking for subtitles. My Chinese friend sent me the contact information for a WeChat account that sells subtitles.

I purchased the subtitles in Word format for all 42 episodes. I was originally trying to use the subs2srs program to generate Anki cards and even spent hours tediously marking the timing information for each line in the first episode. If this had been the only way to go about it, I would have kept doing it this way. But, thankfully, I discovered that I can use Chinese Text Analyser, which is developed by Imron, one of the admins of this forum, to facilitate the process. Imron kindly worked with me to come up with a very useful script. The script collects all the unknown words, the sentences in which they are used, and their English definitions, and turns them into a file ready to be imported into Anki. This way, I can generate hundreds of Anki cards in a matter of seconds. After the cards are imported into Anki, I use the Anki plugin named Chinese Support to automatically fill in the pinyin and generate an audio recording of the Chinese word on each card.
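For readers who want to see the general shape of such an export, here is a minimal Python sketch. It is not the Lua script linked in this post and does not use Chinese Text Analyser's API. The function names, the tiny dictionary, the known-word set, and the greedy longest-match segmentation are all assumptions made for illustration. It produces tab-separated rows (word, example sentence, definition), a plain-text layout that Anki can import.

```python
import csv
import io
import re


def split_sentences(text):
    """Split text into sentences, keeping the closing punctuation mark."""
    parts = re.findall(r"[^。！？!?\n]+[。！？!?]?", text)
    return [p.strip() for p in parts if p.strip()]


def segment(sentence, dictionary, max_len=4):
    """Greedy longest-match segmentation against the dictionary keys.

    Hypothetical stand-in for a real segmenter; single characters are
    emitted as-is when no longer dictionary entry matches.
    """
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def build_anki_rows(text, dictionary, known):
    """Return one (word, sentence, definition) row per unknown word."""
    rows, seen = [], set()
    for sentence in split_sentences(text):
        for word in segment(sentence, dictionary):
            if word in dictionary and word not in known and word not in seen:
                seen.add(word)
                rows.append((word, sentence, dictionary[word]))
    return rows


def to_tsv(rows):
    """Serialise rows as tab-separated text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Made-up sample data, purely for illustration.
sample_dictionary = {"合同": "contract", "公司": "company", "我": "I"}
sample_known = {"我", "公司"}
sample_rows = build_anki_rows("我在公司。合同很长！", sample_dictionary, sample_known)
```

Only the first sentence in which an unknown word appears is kept here. That is a design choice of this sketch, not necessarily how the real script behaves.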

 

 

 

How I am studying these days

 

I first download the mp3 version of the 欢乐颂 episode that I will be studying using http://peggo.tv/

Then, I open the script in Word format and copy and paste the text into Chinese Text Analyser.

I play the mp3 file and follow along, identifying every single new word. This part took around 1 hour for each of the first two episodes and longer for the third episode, because the third episode contained scenes in which the characters were having meetings and I was bombarded with professional vocabulary.

Then, I run the script, import the cards into Anki, and fill in the rest of the information I want to see on the cards.

I study 50 new cards and review 100 cards on Anki per day.

When I exhaust the new cards, I move on to the next episode.

 

 

Images for reference
http://imgur.com/a/tSJgE
 

Script for download (Dropbox)
https://www.dropbox.com/s/2xz5r2ut9oz76za/anki-export.lua?dl=0

 

Link to comment
Share on other sites

 

5 hours ago, pon00050 said:

After the cards are imported into Anki, I use the Anki plugin named Chinese Support to automatically fill in the pinyin and generate an audio recording of the Chinese word on each card.

 

So, you don't have any voice recordings of the actors speaking on your Anki cards?

Link to comment
Share on other sites

On 3/1/2017 at 6:02 PM, Flickserve said:

So, you don't have any voice recordings of the actors speaking on your Anki cards?

I do not.

If I had persisted and somehow succeeded in using the subs2srs program, my Anki cards would have had voice recordings of the actors speaking.

http://subs2srs.sourceforge.net/

 

I figured that the time I would save outweighs the benefit of having voice recordings of the actors speaking on the Anki cards.

Link to comment
Share on other sites

I like people finding their own ways of studying. When you do it your way, you're more motivated. There's a sense of discovery, adventure, and invention that can't be replicated when you just download a premade Anki deck.

 

Still, if you could have timed subtitles for that show, it would be great, as then you could generate cards with subs2srs containing audio clips and screenshots. Some ideas:

 

-Is it this one? https://www.youtube.com/watch?v=4wGpu56WQGQ

If it is, this YouTube video already has synced subtitles... in English. You can download the subtitles as .SRT, open the file in SubtitleEdit, and take advantage of the existing timings by just copying and pasting the Chinese lines in.

 

-This is very boring work, so you could find someone on fiverr willing to do it.
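The retiming idea above (keep the English cue timings, swap in the Chinese text) can be sketched in Python. This is only an illustration, not how SubtitleEdit or subs2srs work internally. The function names are made up, and the sketch assumes the Chinese transcript has already been split so that there is exactly one Chinese line per English cue, which real subtitles rarely guarantee.

```python
import re


def parse_srt(srt_text):
    """Return a list of (timing_line, text) tuples from SRT content."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        # lines[0] is the cue number, lines[1] is the "start --> end" line.
        if len(lines) >= 2 and "-->" in lines[1]:
            cues.append((lines[1].strip(), "\n".join(lines[2:])))
    return cues


def retime(english_srt, chinese_lines):
    """Build a Chinese SRT that reuses the English cue timings.

    Assumes one Chinese line per English cue, in the same order.
    """
    cues = parse_srt(english_srt)
    if len(cues) != len(chinese_lines):
        raise ValueError(
            f"{len(cues)} timed cues but {len(chinese_lines)} Chinese lines"
        )
    blocks = []
    for number, ((timing, _), zh) in enumerate(zip(cues, chinese_lines), start=1):
        blocks.append(f"{number}\n{timing}\n{zh}\n")
    return "\n".join(blocks)


# Made-up two-cue example.
english_example = (
    "1\n00:00:01,000 --> 00:00:02,500\nHello.\n\n"
    "2\n00:00:03,000 --> 00:00:04,000\nHow are you?\n"
)
chinese_example = retime(english_example, ["你好。", "你好吗？"])
```

When the cue counts differ, the function raises an error instead of guessing, because a silent misalignment would attach the wrong audio to every later card.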

Link to comment
Share on other sites

On 3/1/2017 at 7:53 PM, mlescano said:

Still, if you could have timed subtitles for that show, it would be great, as then you could generate cards with subs2srs containing audio clips and screenshots. Some ideas:

 

Completely agreed.

 

On 3/1/2017 at 7:53 PM, mlescano said:

-Is it this one? https://www.youtube.com/watch?v=4wGpu56WQGQ

If it is, this YouTube video already has synced subtitles... in English. You can download the subtitles as .SRT, open the file in SubtitleEdit, and take advantage of the existing timings by just copying and pasting the Chinese lines in.

 

I saw. I tried doing that.

 

On 3/1/2017 at 7:53 PM, mlescano said:

-This is very boring work, so you could find someone on fiverr willing to do it.

 

I didn't go as far as hiring someone to do it because I couldn't afford it.

On 3/1/2017 at 0:09 PM, pon00050 said:

I was originally trying to use the subs2srs program to generate Anki cards and even spent hours tediously marking the timing information for each line in the first episode.

 

The idea that you suggested is exactly what I was talking about in this part of the original post.

Link to comment
Share on other sites

On 3/1/2017 at 8:10 PM, Flickserve said:

Would like to ask how much it cost overall? How many words?

After adding the WeChat account, you can see a page that lists the prices.

For drama scripts, it's 10 rmb per episode.

How many words for what exactly?

The Word file for the first episode has around 9,000 words.

If you are asking how many words there are across all episodes, I will have to get back to you on that.

Link to comment
Share on other sites

Thanks. That means each episode is going to be around 8,000-9,000 words. I was not asking about the total number of words in the complete series.

 

Certainly an intriguing prospect.

Link to comment
Share on other sites

8 hours ago, 艾墨本 said:

Would you mind sharing the Anki decks for this?

 

I attached the Anki deck.

I am not sure how useful this will be to other members.

7 hours ago, imron said:

As the person who wrote the lua script to do this, keep in mind the decks from each episode would only contain words that pon00050 didn't know.

This is exactly the case.

 

Also, I will not be coming back to this thread again and again to share the updated Anki deck.

欢乐颂 Shared with Chinese-Forums.apkg

Link to comment
Share on other sites

A couple of members have contacted me asking whether I would be interested in sharing the transcriptions.

I am not interested in sharing those with anyone.

If you are interested in getting them, contact the WeChat account that I included in the original post directly.

On 3/1/2017 at 0:09 PM, pon00050 said:

Images for reference
http://imgur.com/a/tSJgE
 

 

Link to comment
Share on other sites

On 3/3/2017 at 9:51 PM, pon00050 said:

Also, I will not be coming back to this thread again and again to share the updated Anki deck.

欢乐颂 Shared with Chinese-Forums.apkg

 

 

Thank you. I wanted to see it so that I could see what your money, time, and effort produce and decide whether I want to go a similar route. I'm glad it's working for you, but after seeing what all this produces, I think I'll continue creating cards the way I am currently.

 

I hope to work up to 欢乐颂 at some point. I already got the books (which is when I realized I still need to learn more before I read them) and have already watched the entire show. I really loved it. There is also a wonderful audio book version on ximalaya. Colorfully narrated. 

Link to comment
Share on other sites

30 minutes ago, 艾墨本 said:

I'm glad it's working for you, but after seeing what all this produces, I think I'll continue creating cards the way I am currently.

 

What is it that you wanted to see ideally?

How do you create cards currently?

Link to comment
Share on other sites

I'm using CTA to identify the most common words that I don't know and exporting just the characters for Anki. After importing to Anki, I use "Pinyin Toolkit" to fill in missing card data (though the audio files you are using seem better than what Pinyin Toolkit provides). From there, I add sentences as I learn new words, copying and pasting them from the text. I prefer doing this manually because I like making sure just enough context is provided for the word so that I can link it into the broader story. Sometimes I only need half the sentence, while other times I need an additional sentence on each side.
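As a rough illustration of the "most common words I don't know" step, here is a small Python sketch. It does not use CTA or its export format. It assumes the text has already been segmented into a list of words, and the function name and sample data are made up.

```python
from collections import Counter


def most_common_unknown(words, known, top_n=10):
    """Count words not in the known set and return the most frequent ones."""
    counts = Counter(w for w in words if w not in known)
    return counts.most_common(top_n)


# Made-up, already segmented sample text.
sample_words = ["我", "合同", "公司", "合同", "签", "合同", "签"]
ranked = most_common_unknown(sample_words, known={"我", "公司"})

# One word per line: a plain list that can be imported as single-field notes.
export_text = "\n".join(word for word, _ in ranked)
```

Sorting by frequency means the words that come up most often in the text are studied first.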

 

Reading what you wrote again, I'm confused about one step when you describe what you are currently doing:

On 3/2/2017 at 1:09 AM, pon00050 said:

I play the mp3 file and follow along, identifying every single new word. This part took around 1 hour for each of the first two episodes and longer for the third episode

This seems to contrast with what you said before that:

On 3/2/2017 at 1:09 AM, pon00050 said:

The script collects all the unknown words, the sentences in which they are used, and their English definitions, and turns them into a file ready to be imported into Anki. This way, I can generate hundreds of Anki cards in a matter of seconds.

 

 

 

Does the script's current setup require you to spend an hour per episode, or is that something else?

Link to comment
Share on other sites

My understanding is that pon00050 goes through each episode, listening carefully, and following along in CTA to manually mark words that are known and unknown.

 

This would ensure that the list of known/unknown words is 100% accurate for each episode, but obviously that takes time.

 

Once that is done, running the script takes a matter of seconds.

Link to comment
Share on other sites

1 hour ago, 艾墨本 said:

Reading what you wrote again, I'm confused about one step when you describe what you are currently doing:

 

Happy to answer!

3 minutes ago, imron said:

My understanding is that pon00050 goes through each episode, listening carefully, and following along in CTA to manually mark words that are known and unknown.

 

Also happy that someone else has already answered it for me!

I confirm that Imron's understanding is correct.

Link to comment
Share on other sites
