Moshen Posted January 27, 2022 at 11:01 AM
Quote: "I find non-fiction audiobooks significantly easier than fictions."
This is my experience also. I think it's because with fiction, if there's anything fancy or non-chronological going on, you have to both pay attention to the meaning of the sentences as they flow and also to construct the flow of the story in your mind so you can follow the events. With nonfiction, you just have the former challenge. With fiction, the easiest to follow are those with a single story line and a fable-like storytelling style, like Paulo Coelho's The Alchemist, which is popular in so many languages that I wouldn't be surprised if it is also doing well in Chinese.

Publius Posted January 27, 2022 at 01:12 PM
On 1/27/2022 at 2:16 PM, phills said: "I'll see if I can find a Chinese audio track for it."
Hmm, found two versions on YouTube.
Male voice:
https://youtu.be/zERp1IJ0R4U
https://youtu.be/y0reERQe6zM
Female voice (AI?):
https://youtu.be/jT61CaChw68
https://youtu.be/-krMvR6HwAk

phills (Author) Posted January 28, 2022 at 06:45 AM
On 1/27/2022 at 9:12 PM, Publius said: "Hmm, found two versions on YouTube."
Thanks! I've listened to about an hour so far, and I can understand 60-70 percent of it. Enough not to get lost. I think I'll go through it slowly and use it as a benchmark to see how I'm improving over time.
BTW, I may be oversensitive, but is the female version an AI voice? It seems soooo even that I'm a bit suspicious. The timing and pronunciation of 语言 are exactly the same 7 times over in this 30-second segment, even though the word appears in different parts of sentences. It just struck me as I listened to it. https://youtu.be/jT61CaChw68?t=3132

Publius Posted January 28, 2022 at 09:40 AM
Dunno. Some of her mistakes are easy to explain with a machine, e.g. reading the particle 地 as dì, but then she makes some human mistakes, like reading a character twice.

phills (Author) Posted January 28, 2022 at 01:16 PM
I finally figured out a way to measure my comprehension! I wrote a script that plays subtitles from a file, except it hides them unless I press a button. When I press the button, the script reveals 9 characters around the time I pressed, helping me clear up a short window of audio. So every time I'm confused or don't understand something, I press the Help Button. At the end, the script compares the number of times I pressed against the number of lines and gives me a score.
This is 三体 episode 25 on youtube, a 13:22 min clip. https://youtu.be/Tl3f3IGjJeA?list=PLrjIk2pnoftD-qYNsDJyMYTIN19oBpTCV
Quote:
[01:02] 帶上新買回的微裝具 . . .
[01:49] . . . 見的已經完全不同,
[02:37] 嗯
[02:51] . . . 米高的天文望遠鏡,
[04:02] . . . 着飄逸的黑色長袍,
[04:28] 在137號文明中,
[04:51] . . . 又重新啓動了4次,
[05:00] 只走完了石器時代的 . . .
[05:21] 但人們一直在努力。
[05:55] . . . 看上去結構很精緻,
[07:07] 熄滅了。
[07:46] 唉你可以去查日誌數 . . .
[10:26] . . . 使夾層充滿了亮光,
[11:05] 那是什麼力量驅使着 . . .
[11:56] . . . 代太陽當空熄滅呢?
[12:25] 模擬着外界火海對球 . . .
Helped: 16 out of 270 (5.92%)
According to this, I requested Help on 6% of the lines. That's better than I would have thought -- I would have estimated my understanding at about 85%. (This episode was an easier one, and I've been listening to a lot of 三体 recently. I pretty much understood everything except isolated phrases, and was never lost.)
The test is a bit subjective 'cause I can choose to press for Help or not, but now I have something to track my progress. Sadly, this only works for content I can find soft subtitles for, so not for Sapiens, yet.

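The script itself isn't posted in the thread, so what follows is only a minimal Python 3 sketch of the idea described above (the original is a Python 2 script). The names `reveal_window` and `help_rate` are hypothetical, and positioning the 9-character window by linear interpolation between a line's start and end time is an assumption, not a description of how the real script does it.

```python
def reveal_window(text, line_start, line_end, pressed_at, width=9):
    """Return a `width`-character slice of `text` near the moment Help was pressed.

    Assumption: the speaker moves through the line at a constant pace, so the
    position in the text is estimated by linear interpolation in time.
    """
    if len(text) <= width:
        return text  # short lines are shown whole
    span = max(line_end - line_start, 1e-9)  # avoid division by zero
    frac = min(max((pressed_at - line_start) / span, 0.0), 1.0)
    center = int(frac * len(text))
    # Clamp the window so it never runs off either end of the line.
    start = min(max(center - width // 2, 0), len(text) - width)
    prefix = ". . . " if start > 0 else ""
    suffix = " . . ." if start + width < len(text) else ""
    return prefix + text[start:start + width] + suffix


def help_rate(helped, total):
    """Percentage of subtitle lines on which Help was requested."""
    return 100.0 * helped / total if total else 0.0
```

A press early in a long line would show the first 9 characters followed by an ellipsis, and a press late in the line would show the last 9 preceded by one, which is the same shape as the output quoted above.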
alantin Posted January 28, 2022 at 08:31 PM
@phills Whoa! What did you use to write the script? How do you sync it with the audio? Any chance you could share it?

phills (Author) Posted January 29, 2022 at 06:02 AM
@alantin It's not as fancy as it seems.
First, for videos with soft CC, youtube lets you extract the transcript easily. If you press the "..." button under the video, there's an option to "Open transcript". Once you do that, you can select the transcript and copy and paste it into a file. Make sure "timestamps" is toggled on under the 3-vertical-dot button in the Transcript window, so you get both the lines & the times.
Once you have the transcript file, all the script does is play the subtitles based on the timestamps. Since I never rewind or pause the video, keeping it synced is also easy -- I just start both at the same time. Then it just dutifully hides lines until I ask it to show them, and records the number of times I asked.
The script is a python2 script, running on linux. It's hacked together and unpolished, so I'm not sure if it'll run on Windows without modifications, because python2 is very finicky about Chinese encoding. It assumes the system can process & show utf-8 encoded Chinese. I'm happy to PM it to you, if you're interested.
I also have a "randomize" mode where it randomly hides 2/3rds of the subtitles and shows the rest, slightly delayed. That's an even better mode for learning, I think. (I use it to simulate my previous habit of glancing at the text. Now I don't have to turn my head every 15 seconds.) But "interactive" mode is better for testing and showing progress.

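For anyone who wants to try the same approach, here is a hedged Python 3 sketch of the two pieces described above: reading a pasted transcript and picking which lines the "randomize" mode shows. It assumes the pasted file alternates a timestamp line (`m:ss` or `h:mm:ss`) with a text line, which may not match what every browser pastes. `parse_transcript` and `choose_visible` are hypothetical names, not functions from the actual script.

```python
import random
import re

# Matches "0:01", "12:34" or "1:02:03" on a line by itself.
TIMESTAMP = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{2})$")


def to_seconds(stamp):
    """Convert 'm:ss' or 'h:mm:ss' to a number of seconds."""
    match = TIMESTAMP.match(stamp.strip())
    if match is None:
        raise ValueError("not a timestamp: %r" % stamp)
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


def parse_transcript(raw):
    """Return a list of (start_seconds, text) pairs.

    Assumed layout: a timestamp line followed by one text line.
    Text lines with no timestamp directly before them are skipped.
    """
    cues = []
    pending = None
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line):
            pending = to_seconds(line)
        elif pending is not None:
            cues.append((pending, line))
            pending = None
    return cues


def choose_visible(count, show_fraction=1.0 / 3.0, rng=None):
    """Decide per line whether 'randomize' mode shows it (True) or hides it."""
    rng = rng or random.Random()
    return [rng.random() < show_fraction for _ in range(count)]
```

With the default `show_fraction`, roughly one line in three is shown on average, so about 2/3rds stay hidden. The slight display delay mentioned above is left out of this sketch.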
phills (Author) Posted January 29, 2022 at 06:21 AM
I listened to 2 more episodes of 三体 so I'd have a better baseline. I decided to go with a slightly more liberal "Help" button policy.
Episode 26 -- Shown: 32 out of 235 (13.62%)
Episode 27 -- Shown: 36 out of 279 (12.90%)
Then I tried an episode from @Woodford's fave youtuber, 李永乐, who fortunately also has soft CC subtitles. He speaks really fast and I'd never listened to this video before, so I was curious what my test would say.
https://www.youtube.com/watch?v=NZcqNE5NgGY
Shown: 47 out of 227 (20.70%)
不错! I remember trying to listen to him like 4 months ago, when I first saw him mentioned on this forum, and I got a headache a few minutes into sampling a video, even with subtitles. I also have a better sense now, when I get confused during a video, of exactly how confused I actually am.
Now I just need to find more Chinese youtube audio with soft CC subtitles! E.g. I'd love to get the transcript for Sapiens. Anyone know of a good source?

alantin Posted January 29, 2022 at 06:33 AM
On 1/29/2022 at 8:02 AM, phills said: "The script is a python2 script, running on linux. It's hacked together and unpolished -- so I'm not sure if it'll run on Windows without modifications, because python2 is very finicky about Chinese encoding. It assumes the system can process & show utf-8 encoded Chinese. I'm happy to PM it to you, if you're interested."
That seems like a really good approach! I've mostly done bash and some PS scripts on servers myself, but I should be able to figure out python2 enough to get it running! Can you please send it to me in a PM!
By the way, I haven't used Windows as my workstation for more than 10-15 years now...

phills (Author) Posted January 29, 2022 at 07:21 AM
On 1/29/2022 at 2:33 PM, alantin said: "Can you please send it to me in a PM!"
Sent it! Let me know if it works for you.

alantin Posted January 29, 2022 at 07:49 AM
On 1/29/2022 at 9:21 AM, phills said: "Sent it! Let me know if it works for you."
Thanks! It ran on my mac right out of the box! I tried it with the video linked above about wiping out half of the population of the world, but I guess it's going to take some getting used to before I can use it. I sort of had it synced at first, but then along came some commercial and really screwed it up. There doesn't seem to be a way to see where the script is other than hitting space, and by then it's already too late to check the time on the youtube video. I never got it synced again despite hitting "space", "a" and "d" like I'm playing guitar hero. I'll have to try it again on something else.

phills (Author) Posted January 29, 2022 at 10:39 AM
On 1/29/2022 at 3:49 PM, alantin said: "I sort of had it synced at first but then along came some commercial and really screwed it up."
Great! But yes, commercials screw it up. I pause the subtitle playback when I see the "Commercial in 5.4.3.2.1" message, then restart it when the commercial is over. Otherwise you have to Use the Force to try to match them up again. The 30-second skip is too much -- three 3-second rewinds (= a 9-second rewind) is usually about right for a commercial, assuming you click to close the commercial after 5 seconds.
Also, if you hit escape to quit in the middle, it should save your score up to that point, so at least you didn't waste that time. Trying to rematch them up later is of course more of a chore.

phills (Author) Posted January 30, 2022 at 05:56 AM
@alantin You gave me the idea to add a Jump command to jump to a set time. I also added the ability to Start at a set time, and refined the fast-forward/rewind a bit. This allows me to stop at any time in a video now, and restart at a selected time point.
So I tried an episode reading 平凡的世界 that I found, with soft subtitles. The episode is 22 minutes long.
https://www.youtube.com/watch?v=O-q2pBUn6PA
Time: 10:10, Helped: 35 out of 179 (19.55%)
Time: 04:05, Helped: 12 out of 80 (15.00%)
Time: 08:19, Helped: 24 out of 180 (13.33%)
Free at last! I'm no longer locked in to listening to the whole video in one sitting.
--------------
The experience listening to 平凡的世界 1 is kinda interesting. I have read most of the chapter once before, when sampling the book. My comprehension definitely improved as I went through the chapter. Part of it is the standard warming up to a novel, orienting yourself as to what's going on (plus getting used to the voice). Another part is that the later portions have more dialog, which is easier to understand than description / scene-setting language.
One thing I noticed is that there are (at least) 2 levels of comprehension. When I'm testing myself on listening like this, I zoom in on the words and try to figure out if I can decipher them. This actually makes me slower on orientation than if I just Use the Force and try to holistically understand what's going on. It's similar to what @Publius and @Moshen were saying about fiction v non-fiction.
On 1/27/2022 at 7:01 PM, Moshen said: "I think it's because with fiction, if there's anything fancy or non-chronological going on, you have to both pay attention to the meaning of the sentences as they flow and also to construct the flow of the story in your mind so you can follow the events. With nonfiction, you just have the former challenge."
So even though I had a 19% Help Rate on the first 10 minutes of 平凡的世界 vs a 20% Help Rate on the non-fiction 李永乐 video, my understanding of the fiction was much lower than my understanding of the 李永乐 video. At the beginning of the story, I could barely figure out what was going on, even though I was deciphering the words. It wasn't until the end of that chapter (14% Help Rate) that I could say I understood it as well as the non-fiction video.

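A Jump or Start-at command like the one described in this post mostly comes down to finding which subtitle line is active at a given time. The thread doesn't show how the script does it, so this is just one plausible sketch in Python 3 using the standard library's `bisect`. The name `cue_index_at` is made up for illustration.

```python
import bisect


def cue_index_at(start_times, target):
    """Index of the subtitle line active at `target` seconds.

    `start_times` must be sorted ascending (one entry per subtitle line).
    Returns the last line whose start time is <= target, or 0 if the
    target falls before the first line.
    """
    if not start_times:
        raise ValueError("no subtitle lines loaded")
    index = bisect.bisect_right(start_times, target) - 1
    return max(index, 0)
```

Playback can then resume from that index and the Help counter can keep running, so a session that is split into several sittings still yields one score per segment.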
alantin Posted January 30, 2022 at 09:52 AM
On 1/30/2022 at 7:56 AM, phills said: "@alantin You gave me the idea to add a Jump command to jump to a set time. I also added the ability to Start at a set time, and refined the fast-forward/re-wind a bit. This allows me to stop at any time in a video now, and restart at a selected time point. [...] Free at last! I'm no longer locked-in to listen to the whole video in one sitting."
Whoa! Awesome! Can you send the updated script to me too?

phills (Author) Posted January 30, 2022 at 11:36 AM
Sure, but since I just wrote it last nite/today, it's only been tested once, on that 平凡的世界 episode. So it might be even more finicky than normal.

alantin Posted January 30, 2022 at 12:19 PM
@phills, it works well, and now I have a way to get back to the right place when a commercial comes along and screws it up! Though I can't read fast enough to check characters with it while listening to something.
Maybe I'll see if I can find a way to get subtitles out of Netflix for it, and then I could have it run on the side while watching with subtitles. I could hit space when I get lost and go back to the results after finishing the episode.

phills (Author) Posted January 30, 2022 at 12:30 PM
@alantin That's partly why I limited it to a 9-char window. More than 8-12 chars is too many to read in real time, especially when you're also confused at that moment.
李永乐 is just really fast. You might try some slower pods. I've been looking on youtube for things with soft CC, and I found a few -- popular self-help/finance books, podcasts. I'll share when I get a decent-sized list.
If you figure out a way to extract SRT from netflix, I'd love to know it!

phills (Author) Posted January 31, 2022 at 07:29 AM
To set up a further baseline, I tried sampling 10 minutes of different types of content with soft CC on youtube.
A mass-market, non-fiction book I haven't read before, 女人这东西: https://www.youtube.com/watch?v=SvWzHx7JZqw
Time: 10:09, Chars: 2415, CPM: 237
Helped: 30 out of 240 (12.50%)
A review/discussion podcast on the book Sapiens (which I've read in English): https://www.youtube.com/watch?v=5VUBvVB0VRU
Time: 10:08, Chars: 2867, CPM: 282
Helped: 33 out of 284 (11.62%)
A lifestyle podcast, ActNormal, on time management: https://www.youtube.com/watch?v=tzjyuHe1UoI
Time: 10:07, Chars: 3559, CPM: 351
Helped: 24 out of 273 (8.79%)
A TV drama set in modern times, 一念时光: https://www.youtube.com/watch?v=VP-kGBoPXbU
Time: 10:39, Chars: 854, CPM: 80
Helped: 20 out of 151 (13.25%)
A TV drama set in historical times, 小女霓裳: https://www.youtube.com/watch?v=pz8HumX7Nas
Time: 07:55, Chars: 1043, CPM: 131
Helped: 28 out of 162 (17.28%)
--------------------
Observations:
1. It seems like the Help Button is a decent measure of comprehension. At my current ability, I'd say the levels are:
5% = Good comprehension.
10% = OK comprehension. Can follow with some loss of nuance.
20% = Borderline non-comprehension. Can keep up with the pace and that's it.
2. Podcasts, where a host talks into a camera, are the densest form of audio. That time management podcast was going at 351 CPM (characters per minute), and it didn't seem like she was going at breakneck speed. There was just very little dead time. But as someone else said on one of the listening forum threads, that speed is not as hard as it may seem, because podcasters tend to use simpler, more conversational language. Also, the speed lets you figure out what's going on from the gist, even if you don't catch the exact word.
3. TV dramas have much less spoken content than other forms of audio. A lot of time is taken up by action scenes, music, etc. In a way, they're harder than podcasts, because you have to deal with music & sound effects playing over the audio, actors mumbling, and many different voices at different volume levels. On the other hand, you get the acting, facial expressions & mouth shapes to help you understand. Dramas, being stories, also require "warm-up". I bet if I listened to another hour or 2 of each series, I'd improve on those numbers a bit. But from my other experiences with historical TV drama, I think it'd still be borderline at best, without subtitles.
4. I'd like to add one more type of audio to use as a baseline, and then I'll hold off testing until next month. Something with heavy 口音. All of these samples have very standard accents.

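The CPM figures in this post are consistent with the character count divided by elapsed minutes, with the result truncated to a whole number. That the script truncates rather than rounds is an inference from the five samples, not something stated in the thread. A small Python 3 sketch of that arithmetic, with hypothetical function names:

```python
def mmss_to_minutes(stamp):
    """Convert an 'mm:ss' string such as '10:09' to fractional minutes."""
    minutes, seconds = stamp.split(":")
    return int(minutes) + int(seconds) / 60.0


def chars_per_minute(chars, elapsed):
    """Characters per minute, truncated to an integer (assumed behaviour)."""
    return int(chars / mmss_to_minutes(elapsed))
```

For example, 2415 characters over 10:09 is 2415 / 10.15, about 237.9, which truncates to the 237 reported for the first sample.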
phills (Author) Posted February 8, 2022 at 08:46 PM
75 hours in. Progress has slowed to a crawl. I used to be pleased that I was understanding so much, but now I'm annoyed because I'm noticing all the things I'm not understanding.
1. On the plus side, I cracked Sapiens. I used to barely follow it, but after glancing at the text for Chapter 3, I was able to solidly follow the Chapter 4 audio. Another plus is that I'm better at listening to narration over music. That used to drive me crazy; now it's bearable.
2. I think I'm comprehending better, but any improvement is not really noticeable. I'm collecting data by testing myself every once in a while on a 三体 episode (I only have a few more data points so far). At least the trendline is in the right direction. I'll keep collecting until the end of the month and hopefully add 5-6 more data points.
I'm not testing myself that often, because I find testing is not that helpful for learning. One tends to overfocus on individual words, rather than trying to follow the flow. I need to go back to free-style listening between tests to maintain the ability to understand the overall meaning. On the other hand, glancing at the text every 15 secs (while slowly pushing out your limits) is actually quite helpful for learning, although it's too tempting as a crutch. So I'm rotating through the 3 modes.

TofuChris Posted March 4, 2022 at 01:57 AM
Keep us updated on your comprehension! I am in a similar boat to when you started (did some very intensive reading study for many months and brought that up to a high level at the expense of listening). Interested to see what you find is most useful and least boring; my biggest issue is that I really enjoy reading but don't really enjoy listening or podcasts because once I lose what's going on, I can never seem to get it back.