pon00050 Posted March 1, 2017 at 05:09 PM
I recently finished watching all episodes of 欢乐颂 and decided that there is a lot of good vocabulary I can learn from this drama, so I started looking for subtitles. My Chinese friend sent me the contact information for a WeChat account that sells subtitles, and I purchased the subtitles in Word format for all 42 episodes.
I was originally trying to use the subs2srs program to generate Anki cards, and I even spent hours tediously marking the time information for each line in the first episode. If this were the only way to go about it, I would have done it this way. But, thankfully, I discovered that I can use the Chinese Text Analyser program, which is developed by Imron, one of the admins of this forum, to facilitate the process. Imron kindly worked with me to come up with a very useful script. The script collects all the unknown words, the sentences in which they are used, and their English definitions, and turns them into a file ready to be imported by Anki. This way, I can generate hundreds of Anki cards in a matter of seconds. After the cards are imported into Anki, I use the Anki plugin named Chinese Support to automatically fill in the pinyin and generate a voice recording of the Chinese word on each card.
How I am studying these days
I first download the mp3 version of the 欢乐颂 episode that I will be studying using http://peggo.tv/ Then I open the episode's script (the Word file) and copy and paste the text into Chinese Text Analyser. I play the mp3 file and follow along, identifying every single new word. This part took around 1 hour for the first two episodes and more for the third episode, because the third episode contained scenes in which characters were having meetings and I was bombarded with professional vocabulary. Then I run the script, import the cards into Anki, and fill in the rest of the information I want to see on the cards. I study 50 new cards and review 100 cards on Anki per day. When I exhaust the new cards, I move on to the next episode.
Images for reference: http://imgur.com/a/tSJgE
Script for download (Dropbox): https://www.dropbox.com/s/2xz5r2ut9oz76za/anki-export.lua?dl=0
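For readers curious what such an export involves, here is a minimal Python sketch of the idea described in the post: find each unknown word, take the first sentence that contains it, attach a definition, and write tab-separated lines that Anki can import as a text file. This is not Imron's Lua script and it does not use Chinese Text Analyser's API. The function names, the sample text, and the tiny dictionary are all hypothetical, and words are matched by plain substring search rather than by real word segmentation.

```python
import csv
import io
import re


def split_sentences(text):
    """Split Chinese text into sentences, keeping the closing punctuation."""
    parts = re.findall(r"[^。！？!?\n]+[。！？!?]?", text)
    return [p.strip() for p in parts if p.strip()]


def build_anki_rows(text, unknown_words, definitions):
    """Return one (word, sentence, definition) row per unknown word found.

    Only the first sentence containing each word is used. Matching is a
    naive substring test (an assumption of this sketch, not how CTA works).
    """
    rows = []
    seen = set()
    for sentence in split_sentences(text):
        for word in unknown_words:
            if word in sentence and word not in seen:
                seen.add(word)
                rows.append((word, sentence, definitions.get(word, "")))
    return rows


def to_tsv(rows):
    """Serialise rows as tab-separated text, one note per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample_text = "她在公司开会。他们讨论融资。"
    sample_unknown = ["开会", "融资"]
    sample_definitions = {"开会": "to hold a meeting", "融资": "financing"}
    print(to_tsv(build_anki_rows(sample_text, sample_unknown, sample_definitions)))
```

Saving the output of `to_tsv` to a `.txt` file and importing it with the tab as the field separator should give one note per unknown word, with the word, its example sentence, and the definition in separate fields.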
Flickserve Posted March 1, 2017 at 11:02 PM
5 hours ago, pon00050 said: After the cards are imported into Anki, I use the Anki plugin named Chinese Support to automatically fill pinyin and generate automatic voice recording of the Chinese word on each card.
So, you don't have any voice recordings of the actors speaking on your Anki cards?
pon00050 Posted March 1, 2017 at 11:58 PM Author
On 3/1/2017 at 6:02 PM, Flickserve said: So, you don't have any voice recordings of the actors speaking on your Anki cards?
I do not. If I had persisted and somehow succeeded in using the subs2srs program (http://subs2srs.sourceforge.net/), my Anki cards would have had voice recordings of the actors speaking. I figured that the time I save outweighs the benefit of having voice recordings of the actors on the Anki cards.
mlescano Posted March 2, 2017 at 12:53 AM
I like people finding their own ways of studying. When you do it your way, you're more motivated. There's a sense of discovery, adventure, and invention that can't be replicated when you just download a premade Anki deck. Still, if you could have timed subtitles for that show, it would be great, as then you could generate cards with subs2srs containing audio clips and screenshots. Some ideas:
-Is it this one? https://www.youtube.com/watch?v=4wGpu56WQGQ If it is, this YouTube video already has synced subtitles... in English. You can download the subtitles as .SRT, open the file in SubtitleEdit, and take advantage of the existing timings by just copying and pasting the Chinese lines over the English ones.
-This is very boring work, so you could find someone on Fiverr willing to do it.
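The "reuse the English timings" idea above can be roughed out in a few lines of Python. This is only a sketch under a strong assumption: that the Chinese transcript has exactly one line per English subtitle cue, in the same order, which real transcripts often do not. The function names and sample data are hypothetical, and in practice a tool like SubtitleEdit would still be needed to fix mismatches by hand.

```python
import re


def parse_srt(srt_text):
    """Parse SRT text into a list of (index, timing, text) cues."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        # A valid cue has a number line, then a timing line with "-->".
        if len(lines) >= 2 and "-->" in lines[1]:
            cues.append((lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def retime(srt_text, chinese_lines):
    """Keep each cue's timing but replace its text with a Chinese line.

    Assumes a one-to-one, in-order match between cues and Chinese lines
    and refuses to guess when the counts differ.
    """
    cues = parse_srt(srt_text)
    if len(cues) != len(chinese_lines):
        raise ValueError(
            f"{len(cues)} timed cues but {len(chinese_lines)} Chinese lines"
        )
    blocks = []
    for number, (cue, chinese) in enumerate(zip(cues, chinese_lines), start=1):
        timing = cue[1]
        blocks.append(f"{number}\n{timing}\n{chinese}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical two-cue example, for illustration only.
    english = (
        "1\n00:00:01,000 --> 00:00:03,000\nHello.\n\n"
        "2\n00:00:04,000 --> 00:00:06,500\nHow are you?\n"
    )
    print(retime(english, ["你好。", "你好吗？"]))
```

Raising an error on a count mismatch is deliberate: silently pairing lines that do not correspond would produce cards whose audio and text disagree.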
pon00050 Posted March 2, 2017 at 01:00 AM Author
On 3/1/2017 at 7:53 PM, mlescano said: Still, if you could have timed subtitles for that show, it would be great, as then you could generate cards with subs2srs containing audio clips and screenshots. Some ideas:
Completely agreed.
On 3/1/2017 at 7:53 PM, mlescano said: -Is it this one? https://www.youtube.com/watch?v=4wGpu56WQGQ If it is, this YouTube video already have synced subtitles... In English. You can download the subtitles as .SRT, open it in SubtitleEdit and take advantage of the existing timings by just copying and pasting the Chinese ones.
I saw it, and I tried doing that.
On 3/1/2017 at 7:53 PM, mlescano said: -This is very boring work, so you could find someone on fiverr willing to do it.
I didn't go as far as hiring someone to do it because I couldn't afford it.
On 3/1/2017 at 0:09 PM, pon00050 said: I was originally trying to make use of the subs2srs program to generate Anki cards and even spent hours tediously marking the time information for each line in the first episode.
The idea that you suggested is exactly what I was talking about in this part of the original post.
Flickserve Posted March 2, 2017 at 01:10 AM
7 hours ago, pon00050 said: purchased the subtitles in word format for all 42 episodes.
I would like to ask how much it cost overall. And how many words?
pon00050 Posted March 2, 2017 at 01:13 AM Author
On 3/1/2017 at 8:10 PM, Flickserve said: Would like to ask how much it cost overall? How many words?
After adding the WeChat account, you can see a page that lists the prices. For drama scripts, it's 10 RMB per episode. How many words for what exactly? The Word file for the first episode has around 9,000 words. If you are asking how many words there are across all episodes, I will have to get back to you on that.
Flickserve Posted March 2, 2017 at 07:57 AM
Thanks. That means each episode is going to be around 8,000-9,000 words. I was not asking about the total number of words in the whole series. Certainly an intriguing prospect.
艾墨本 Posted March 3, 2017 at 05:30 AM
Would you mind sharing the Anki decks for this?
imron Posted March 3, 2017 at 05:55 AM
As the person who wrote the Lua script to do this, I should point out that the decks from each episode would only contain words that pon00050 didn't know.
艾墨本 Posted March 3, 2017 at 06:08 AM
I'd be okay with that.
Flickserve Posted March 3, 2017 at 07:21 AM
1 hour ago, 艾墨本 said: I'd be okay with that
You really know how to make me feel inadequate.
pon00050 Posted March 3, 2017 at 01:51 PM Author
8 hours ago, 艾墨本 said: Would you mind sharing the Anki decks for this?
I attached the Anki deck. I am not sure how useful this will be to other members.
7 hours ago, imron said: As the person who wrote the lua script to do this, keep in mind the decks from each episode would only contain words that pon00050 didn't know.
This is exactly the case. Also, I will not be coming back to this thread again and again to share the updated Anki deck.
欢乐颂 Shared with Chinese-Forums.apkg
wensente Posted March 4, 2017 at 07:49 AM
I just finished the 42nd episode as well, and near the end I even started liking it! (I was definitely not the target audience for this show.) Very excited to try the pon00050 method!
pon00050 Posted March 4, 2017 at 03:31 PM Author
A couple of members have contacted me asking whether I would be interested in sharing the transcriptions. I am not interested in sharing those with anyone. If you are interested in getting them, contact the WeChat account directly; it is shown in the images I included in the original post:
On 3/1/2017 at 0:09 PM, pon00050 said: Images for reference http://imgur.com/a/tSJgE
艾墨本 Posted March 5, 2017 at 02:33 AM
On 3/3/2017 at 9:51 PM, pon00050 said: Also, I will not be coming back to this thread again and again to share the updated Anki deck. 欢乐颂 Shared with Chinese-Forums.apkg
Thank you. I wanted to see what your money, time, and effort produce so that I can decide whether I want to go a similar route. I'm glad it's working for you, but after seeing what all this produces, I think I'll continue creating cards the way I am currently.
I hope to work up to 欢乐颂 at some point. I already got the books (which is when I realized I still need to learn more before I read them) and have already watched the entire show. I really loved it. There is also a wonderful, colorfully narrated audiobook version on Ximalaya.
pon00050 Posted March 5, 2017 at 03:04 AM Author
30 minutes ago, 艾墨本 said: I'm glad it's working for you, but after seeing what all this produces, I think I'll continue creating cards the way I am currently.
What is it that you ideally wanted to see? How do you create cards currently?
艾墨本 Posted March 6, 2017 at 07:42 AM
I'm using CTA to identify the most common words that I don't know and exporting just the characters for Anki. After importing them into Anki, I use "Pinyin Toolkit" to fill in the missing card data (though the audio files you are using seem better than what Pinyin Toolkit provides). From there, I add sentences as I learn new words, copying and pasting them from the text. I prefer doing this manually because I like making sure just enough context is provided for the word so that I can link it into the broader story. Sometimes I only need half the sentence, while other times I need an additional sentence on each side.
Reading what you wrote again, I'm confused about one step when you describe what you are currently doing:
On 3/2/2017 at 1:09 AM, pon00050 said: I play the mp3 file and follow along identifying every single new word. This part took around 1 hour for the first two episodes and more for the third episode
In contrast to what you say before that:
On 3/2/2017 at 1:09 AM, pon00050 said: The script collects all the unknown words, sentences in which they are used, English definitions and turns them into a file ready to be imported by Anki. This way, I can generate hundreds of Anki cards in a matter of few seconds.
Does the script's current setup require you to spend an hour per episode, or is that something else?
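The more manual workflow described in this post (rank the unknown words by how often they occur, then choose how much surrounding context each card gets) could be sketched in Python roughly as follows. This is an illustration only: it assumes the text has already been segmented into words and split into sentences, which is the job CTA does in the real workflow, and the function names and parameters are hypothetical.

```python
from collections import Counter


def rank_unknown(tokens, known_words):
    """Return (word, count) pairs for unknown words, most frequent first."""
    counts = Counter(token for token in tokens if token not in known_words)
    return counts.most_common()


def with_context(sentences, index, before=0, after=0):
    """Join the sentence at `index` with its neighbours.

    `before` and `after` are how many extra sentences to include on each
    side; the window is clipped at the start and end of the text.
    """
    start = max(0, index - before)
    end = min(len(sentences), index + after + 1)
    return "".join(sentences[start:end])


if __name__ == "__main__":
    # Hypothetical pre-segmented sample, for illustration only.
    tokens = ["我", "开会", "融资", "开会", "我"]
    print(rank_unknown(tokens, known_words={"我"}))
    sentences = ["她很忙。", "她在公司开会。", "会议很长。"]
    print(with_context(sentences, 1, before=1, after=1))
```

Leaving `before` and `after` at zero gives a single-sentence card; the "half a sentence" case mentioned above still needs a human decision and is not covered by this sketch.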
imron Posted March 6, 2017 at 08:44 AM
My understanding is that pon00050 goes through each episode, listening carefully, and following along in CTA to manually mark words that are known and unknown. This would ensure that the list of known/unknown words is 100% accurate for each episode, but obviously that takes time. Once that is done, running the script takes a matter of seconds.
pon00050 Posted March 6, 2017 at 08:47 AM Author
1 hour ago, 艾墨本 said: Reading what you wrote again, I'm confused about one step when you describe what you are currently doing:
Happy to answer!
3 minutes ago, imron said: My understanding is that pon00050 goes through each episode, listening carefully, and following along in CTA to manually mark words that are known and unknown.
Also happy that someone else has already answered it for me! I confirm that Imron's understanding is correct.