Chinese-Forums

Lazybug: free, open source app for smart subtitles (OCR)


martindbp


nice work. I can especially appreciate the blurring of the hardcoded subs, that's something I have wanted to have on youtube for a while 

I'm a newbie dev and have been playing around with making some language tools for my chinese. I'll have a look and try contributing. first feature that comes to mind is some type of filtered search (e.g by movies)

cheers


On 3/5/2023 at 2:35 AM, malazann said:

nice work. I can especially appreciate the blurring of the hardcoded subs, that's something I have wanted to have on youtube for a while 

 

Thanks! Yeah the idea is to hide information you don't need for comprehension, since extra information tends to short-circuit your brain. Unfortunately Youtube auto-enables soft subs if they exist nowadays so you have to turn them off manually.

 

Quote

I'm a newbie dev and have been playing around with making some language tools for my chinese. I'll have a look and try contributing. first feature that comes to mind is some type of filtered search (e.g by movies)


Sure, help on the front end especially would be very appreciated! It's possible to click the categories to filter and then search but it's not very intuitive. Also thinking movies/tv/other maybe should be different tabs, especially as there may be other content types such as music videos or podcasts. Either way, if you'd like to make a contribution that would indeed be a good place to start. I made a sticky thread in the Discourse forum where we can discuss the details if you'd like.

 


This looks really useful! My biggest struggle right now is lowering my dependence on subtitles for TV shows, this seems like a good tool for exactly that. Will give this a good try the following days, for now I have a quick question: how can I hide all but the Hanzi row below the video? The peek buttons seem to have no effect (tried Firefox/Chrome), it's either all rows or none.


On 3/5/2023 at 4:43 AM, jannesan said:

Will give this a good try the following days, for now I have a quick question: how can I hide all but the Hanzi row below the video?

I have the same issue. Hanzi  is all I want to see.  汉字字幕。 Anything else detracts from the learning and just gets in the way. 


On 3/5/2023 at 11:43 AM, jannesan said:

The peek buttons seem to have no effect (tried Firefox/Chrome), it's either all rows or none.


Yikes, you mean if you peek a single hidden word it doesn't work? Perhaps you could shoot me an email with a screenshot and/or a way to reproduce (me@martindbp.com)?


 

Quote

how can I hide all but the Hanzi row below the video?

Quote

I have the same issue. Hanzi  is all I want to see.  汉字字幕。 Anything else detracts from the learning and just gets in the way. 

 

Just the type of feedback I need, thanks! I've only been using it myself, so I kind of get tunnel vision with my own preferences...

Here's how it was intended to be used: if you're intermediate/advanced, select your HSK level in the options. If you want to hide everything, you can drag the slider all the way to the right to select "all". This way most/all of the words will be hidden automatically. The first time, they will be shown and then fade away; if this is annoying I can add an option to remove that. Then, if you want to pin a whole row, like the hanzi, there's a row button. Pressing it should peek the whole row, and there's also a "pin" button in the context menu that pops up. [added screenshot to show how to hide all words and pin the hanzi row]
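
As a rough illustration of the HSK-level hiding described above, here is a minimal Python sketch. It is not Lazybug's actual code: the `Word` type, the `hsk` field, the `ALL` value and the `hidden_words` helper are all made up for this example, and it assumes a word counts as "known" when its HSK level is at or below the level you selected. The HSK levels in the sample sentence are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

ALL = "all"  # hypothetical slider value meaning "hide every word"


@dataclass
class Word:
    hanzi: str
    hsk: Optional[int]  # HSK level of the word, None if it is not in any HSK list


def hidden_words(words: List[Word], user_level: Union[int, str]) -> List[str]:
    """Return the hanzi of the words that would be hidden for this user."""
    if user_level == ALL:
        return [w.hanzi for w in words]
    # Assumption: a word is treated as known (and hidden) when its
    # HSK level is at or below the level the user selected.
    return [w.hanzi for w in words if w.hsk is not None and w.hsk <= user_level]


# Sample sentence; the levels are for illustration only.
sentence = [Word("我", 1), Word("喜欢", 1), Word("天后", None), Word("音乐", 3)]
print(hidden_words(sentence, 2))    # words at HSK 1-2 are hidden
print(hidden_words(sentence, ALL))  # everything is hidden
```

With a level of 2, only the HSK 1-2 words are hidden and the rarer words stay visible; with "all", every word is hidden until you peek it.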

Unrelated, but since everyone here is probably a power user, it's convenient to use the keyboard shortcuts to navigate and peek:

  • left/right arrow: go to previous/next subtitle
  • T: peek full sentence [t]ranslation
  • Y: peek whole pin[y]in row
  • H: peek whole [h]anzi row
  • N: peek whole word tra[n]slation row
  • R: [r]eplay the current subtitle
  • P: [p]eek everything
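
The shortcut list above is essentially a key-to-action table. A minimal Python sketch of such a table follows; the keys come from the list, but the action names, the `"ArrowLeft"`/`"ArrowRight"` key labels and the `action_for_key` helper are assumptions for illustration, not the app's real identifiers.

```python
from typing import Optional

# Keys are taken from the list above; the action names are hypothetical labels.
SHORTCUTS = {
    "ArrowLeft": "previous_subtitle",
    "ArrowRight": "next_subtitle",
    "t": "peek_translation",
    "y": "peek_pinyin_row",
    "h": "peek_hanzi_row",
    "n": "peek_word_translation_row",
    "r": "replay_subtitle",
    "p": "peek_everything",
}


def action_for_key(key: str) -> Optional[str]:
    """Look up the action for a key press, ignoring case for letter keys."""
    if len(key) == 1:
        key = key.lower()
    return SHORTCUTS.get(key)  # None when the key has no shortcut


print(action_for_key("H"))
print(action_for_key("ArrowRight"))
```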

hideall.png

pinrow.png


Wait, did you write the segmentation yourself?  It seems pretty accurate.

 

I was a little bit surprised when I saw "Youtube videos can be downloaded, but not others" on the github page.  I use youtube-dl to download videos outside of YouTube, like Bilibili.  It works fine for me.  Maybe you have another reason in mind.

 

By the way, jianwai.youdao.com gives fairly decent speech-to-text, as an alternative to transcribing hard-coded subtitles.  It's limited to 2 hours of video per day (free account).

 

On 3/5/2023 at 6:43 PM, jannesan said:

how can I hide all but the Hanzi row below the video?

 

I also want nothing other than hanzi.  No pinyin.  No English.  No other English.  Just large-font, easily readable hanzi, and for it to be hidden by default (since I'm supposed to be listening, not reading).

 

Also, I'd like for it to count how many times (or in how many videos) I've encountered each word.
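
For what it's worth, the counting requested here could be sketched in Python roughly as below. This is only an assumption about how it might work, not anything from the app: `count_encounters` and the `videos` structure are hypothetical, and the subtitles are assumed to be segmented into words already.

```python
from collections import Counter
from typing import Dict, List, Tuple


def count_encounters(videos: Dict[str, List[List[str]]]) -> Tuple[Counter, Counter]:
    """videos maps a video id to its subtitle lines, each a list of segmented words."""
    total = Counter()      # how many times each word appeared overall
    per_video = Counter()  # in how many videos each word appeared
    for subtitles in videos.values():
        seen_in_video = set()
        for words in subtitles:
            total.update(words)
            seen_in_video.update(words)
        per_video.update(seen_in_video)  # each video counts a word once
    return total, per_video


# Made-up sample data for illustration.
videos = {
    "ep1": [["我", "喜欢", "音乐"], ["我", "知道"]],
    "ep2": [["音乐", "很", "好"]],
}
total, per_video = count_encounters(videos)
print(total["我"], per_video["我"])
print(total["音乐"], per_video["音乐"])
```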


On 3/6/2023 at 7:53 AM, becky82 said:

Wait, did you write the segmentation yourself?  It seems pretty accurate.


Yep! My day job is Computer Vision and ML, so this was really the core of the initial idea. The accuracy depends a bit on the video. Older, low-res videos are a bit of a problem, as well as those where there's white text on white background. Also need to support calligraphy and other rare fonts, as well as traditional characters. I'll write up a blog post on how it works at some point if there's interest!
 

Quote

I was a little bit surprised when I saw "Youtube videos can be downloaded, but not others" on the github page.  I use youtube-dl to download videos outside of YouTube, like Bilibili.  It works fine for me.  Maybe you have another reason in mind.


I should reword, you can use any video, and download with youtube-dl from other sites, but at this point Lazybug only supports Youtube, mainly because Youtube is the only site that I know of that allows video embedding. But once the browser extension is released the goal is to support any site. I'm working on a method of video extraction and processing that will work on any site even if it's not downloadable, although it will be a lot slower (have to let the video play visibly in the browser).

 

On 3/6/2023 at 7:53 AM, becky82 said:

I also want nothing other than hanzi.  No pinyin.  No English.  No other English.  Just large-font, easily readable hanzi, and for it to be hidden by default (since I'm supposed to be listening, not reading).

 

Also, I'd like for it to count how many times (or in how many videos) I've encountered each word.


So do I understand correctly: you want there to be no information displayed at all by default. And when you need the hanzi you want to show the whole hanzi sentence at once?

If that's the case you should be able to do this by setting your HSK level to max (all) and then clicking the peek hanzi row button (or pressing "h"). This might not be convenient enough right now, I'm just trying to understand if this is closer to your preferred workflow.

 

Here's just an idea: would it be a useful option that everything is hidden but then shown when the video is paused?

Btw, font size can be changed using the +/- buttons at the top subtitle menu. Adding more fonts is also on my todo list :)


On 3/6/2023 at 5:16 PM, martindbp said:

Here's just an idea: would it be a useful option that everything is hidden but then shown when the video is paused?

 

Actually, that sounds perfect!  However, anything other than hanzi is not especially useful at my level (and I'm going to use a browser popup dictionary anyway).

 

I tried enlarging the hanzi, but it enlarges everything, and the things I don't use take up a lot of space. 

 

1906523192_Screenshotfrom2023-03-0618-15-44.thumb.png.350518f5f9d75ab1275cc1a4f5284675.png

 

1760889026_Screenshotfrom2023-03-0618-17-02.thumb.png.28f43a3f512554dffddccd00b3c0c7dc.png

 

 


On 3/6/2023 at 11:23 AM, becky82 said:

Actually, that sounds perfect!  However, anything other than hanzi is not especially useful at my level (and I'm going to use a browser popup dictionary anyway).

 

I tried enlarging the hanzi, but it enlarges everything, and the things I don't use take up a lot of space. 

 

Got it, sounds like we need a "plain" mode with just pure text so it doesn't interfere with other popup dictionaries as well, shouldn't take too long.


On 3/6/2023 at 5:17 AM, martindbp said:

Got it, sounds like we need a "plain" mode with just pure text so it doesn't interfere with other popup dictionaries as well, shouldn't take too long.

 

Yes, please. That would work for me. I prefer only Hanzi to be displayed, in an eye-friendly, easily-legible font. I am often watching video in sub-optimal lighting and my visual acuity isn't perfect. 

 

Thank you!


You can now uncheck the "Use Smart Subtitles" checkbox to get just plain text. The rest works as before, pin it to display it all the time, otherwise you can peek it by clicking or pressing "h". I'll add an option to auto-peek when the video is paused as well soon. Btw, is there any particular font you guys prefer? I'm not sure if the current one is considered "easily-legible" but it seems to be the browser default.
 plainsubtitles.thumb.png.6dbac9fbc986c08f4c066f337531b297.png


This is an awesome app!  I've been doing something similar manually with a few scripts but this makes it so much easier.

 

There's a lot I'd like to futz with regarding what's shown and what's not, but even out of the box, I can see it's already useful.

 

I'll echo other commenters' preference to only show Hanzi.  I like the smart subtitles features, but I don't want the rows for Pinyin or English or Translations which take up screen real estate.  It'd be nice if there was an X button next to the pin button for each line, to banish that line.  That way the video can take up as much of the screen as possible rather than just the top half.


Edit: Just to illustrate what I mean.  The top picture is the current app, where I've undocked the smart subtitles and pushed it to the bottom edge, to hide away the other lines as much as possible.  But the pinyin (PY) line is still there.

 

threebody.thumb.jpg.6f90a6ed2194ea74490b40719542ae4b.jpg

 

The second picture is what I'd like to be able to do: X out the pinyin as well, so the video takes up the whole screen.  The smart subtitle feature is still on, showing only the word 天后:

 

threebodyedited.thumb.jpg.b2c28f702dc5b36c7f2b3e7e45e8bddf.jpg


One other feature I'd suggest is when I pause, I'd like to be able to rewind back a few lines without rewinding the video. 

 

Sometimes I'm not sure if I understood a word until 5 seconds later, when suddenly nothing after it makes sense anymore.  I went wrong somewhere.  So when I pause, I'd like to scroll back a bit and figure out exactly which word I misheard.  That fits in well with the ability to peek in your app.
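
One way to picture this "scroll back through lines without rewinding" idea is to keep the displayed subtitle separate from the playback position. The Python sketch below is purely illustrative; the `SubtitleBrowser` class and its method names are hypothetical and not part of the app.

```python
from typing import List


class SubtitleBrowser:
    """Tracks which subtitle line is displayed, independent of the video position."""

    def __init__(self, subtitles: List[str]):
        self.subtitles = subtitles
        self.playing_index = 0  # line that matches the current video position
        self.offset = 0         # how many lines back the viewer has scrolled

    def on_video_progress(self, index: int) -> None:
        """Called when playback reaches a new line; resume following the video."""
        self.playing_index = index
        self.offset = 0

    def scroll_back(self, lines: int = 1) -> None:
        # Never scroll past the first subtitle line.
        self.offset = min(self.offset + lines, self.playing_index)

    def scroll_forward(self, lines: int = 1) -> None:
        # Never scroll past the line currently playing.
        self.offset = max(self.offset - lines, 0)

    def displayed(self) -> str:
        return self.subtitles[self.playing_index - self.offset]


browser = SubtitleBrowser(["line 1", "line 2", "line 3", "line 4"])
browser.on_video_progress(3)  # video is paused on the fourth line
browser.scroll_back(2)        # look two lines back; the video itself does not move
print(browser.displayed(), browser.playing_index)
```

Because only `offset` changes while scrolling, the video position stays put and playback can resume from where it was paused.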

 

I also don't like the fade-out feature on subtitle words.  I prefer either a show or no-show.  One anxiety for language learners is that words go by too fast.  Having words fade away just makes that anxiety worse.  I'd rather the gray words not show, and have to either: 1. click on them if I want them to appear, or 2. adjust the filter to let more words through.  I think that better fits with your philosophy of getting rid of unnecessary info.

 

Having to click 3 times on each word is counter-intuitive.  If you click once, it appears in gray; if you click twice, it appears in white.  You have to click 3 times for it to completely cycle through.  This is a side-effect of having blurred & gray & white status for words.

 

Also, the icon pops up over the hidden words when you click them, which means you have to move your mouse again to see the word after you click.  Three actions -- move to character, click, move-off -- to peek.  Plus, there's a green number on the upper right, plus an icon bar above, which clutters the whole area.  You can't focus on the character, which is the point of the interaction.

 

Ideally, I'd like it so that when you click the blur becomes a word, and when you click again, the word becomes a blur.  With no extra clutter around, and no extra mouse movements.  That way, I can click to peek and unpeek as easily as possible.
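
The difference between the three-click cycle described above and the proposed two-state toggle can be sketched in Python as follows. The `WordState` enum and both function names are made up for illustration; the state order (blurred, gray, white) follows the description in this post, not the app's source.

```python
from enum import Enum


class WordState(Enum):
    BLURRED = "blurred"
    GRAY = "gray"
    WHITE = "white"


def click_three_state(state: WordState) -> WordState:
    """Behaviour as described in the post: blurred -> gray -> white -> blurred."""
    order = [WordState.BLURRED, WordState.GRAY, WordState.WHITE]
    return order[(order.index(state) + 1) % len(order)]


def click_two_state(state: WordState) -> WordState:
    """Proposed behaviour: a click reveals a blurred word, the next click blurs it again."""
    return WordState.WHITE if state is WordState.BLURRED else WordState.BLURRED


state = WordState.BLURRED
for _ in range(3):
    state = click_three_state(state)
print(state)  # three clicks are needed to get back to blurred

state = click_two_state(click_two_state(WordState.BLURRED))
print(state)  # two clicks are enough with the toggle
```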


On 3/10/2023 at 12:44 PM, phills said:

I'll echo other commenters' preference to only show Hanzi.  I like the smart subtitles features, but I don't want the rows for English or Pinyin or Translations which take up screen real estate. 


Thanks for your feedback! Did you try turning off smart subtitles as per my more recent post above?
 

Quote

It'd be nice if there was an X button next to the pin button for each line, to banish that line.  That way the video can take up as much of the screen as possible rather than just the top half.


I was thinking of implementing it something like this, but since many here are using other popup dictionaries I implemented the mode where the whole hanzi sentence is one element with minimal interactivity, so it doesn't interfere. If there was a way to turn off individual rows for the smart subtitles, would that be useful?


On 3/10/2023 at 8:12 PM, martindbp said:

Thanks for your feedback! Did you try turning off smart subtitles as per my more recent post above?

 

I like the smart subtitles though.  If I turn it off, the whole sentence will be blurred.  I like being able to click to peek and unpeek on a word by word basis, and have the software only show me some of the words in each sentence and make me fill out the rest on my own.

 

On 3/10/2023 at 8:12 PM, martindbp said:

If there was a way to turn off individual rows for the smart subtitles, would that be useful?

 

Yes, that would help reduce the clutter.


On 3/10/2023 at 1:12 PM, phills said:

I'd rather the gray words not show, and have to either: 1. click on them if I want them to appear, or 2. adjust the filter to let more words through.  I think that better fits with your philosophy of getting rid of unnecessary info.

 

Mm, yes, I've come to the same conclusion. I added the fading out when a word is automatically hidden due to HSK level because I'd set my own level to HSK5 although there are still a lot of words there I don't know yet. So then when I see it being faded out I can quickly see what happened and reverse it. But I agree it adds too much cognitive load and confusion to new users... 

 

Quote

Also, the icon pops up over the hidden words when you click them, which means you have to move your mouse again to see the word after you click.  Three actions -- move to character, click, move-off -- to peek.  Plus, there's a green number on the upper right, plus an icon bar above, which clutters the whole area.  You can't focus on the character, which is the point of the interaction


Good points, it shows I'm not a UI designer. So the main issue is that you have to move your mouse away to read the character, right? You could just hide the cursor after clicking (and show it again after moving the mouse away), would that help? Also the popup context menu could be hidden until you move the mouse. Those would be fairly quick fixes. Btw the green number means how many times the word shows up in the video, which helps to decide whether to try to learn it, but it could also be moved elsewhere, perhaps to the popup context menu, so it doesn't clutter.

I will take a look and see if I can refactor the UI a bit this weekend, also adding options for turning off individual rows :)

 

On 3/10/2023 at 2:18 PM, calculatrix said:

Thank you for your work!
Now I am really happy that the weather is so dreadful. So I will be bingeing 三体 the whole weekend.

❤️ warms my heart! I'm watching through it as well, but "unfortunately" I read the books already in English so the story is spoiled a bit, but if you didn't you're in for a treat!

