
Introducing Chinese Text Analyser



Posted

@imron

It would be fine to add also a function that can mark characters from HSK1, 2, 3, 4, 5, 6, as is already done for words. Statistics are important too, in particular for the unique characters from these sets. It's very helpful that CTA already shows the percentage of unique characters, though only for all of them together. In other words, it would be very desirable to have everything for characters that exists for words. I could invest US$1,000 in that for the benefit of all, without any special rights for myself.

  • Like 1
Posted

@Jan Finster

I fully agree with you that characters are important in their own right, not only as parts of words. My idea is that learning a sufficient number of characters in advance, say the HSK5 set, would let one advance more rapidly later, when starting to read HSK5 texts and other materials. It would be very interesting to know your opinion of my four-element system for learning characters: https://www.chinese-forums.com/forums/topic/59580-four-element-system-to-learn-characters-their-pinyin-and-meaning/

  • Like 1
  • 3 weeks later...
Posted

Imron, how do you page up/down in CTA on MacOS? Pressing Fn + up/down arrow keys does not work. It seems you can only scroll text a few lines at a time.

Posted

There does not appear to be a way to do it, which is an oversight: the feature is there on other platforms, so it seems I forgot to hook up the keypress on Mac. It'll go into the next release.

  • Thanks 1
Posted

Imron, how do I add new words on the Mac OS version of CTA? I cannot get the pop-up menu to appear using the touchpad, even when holding down the control key.  (I can with a right-clickable mouse though.) 

Posted

Oh I see. I need to click the touchpad with two fingers simultaneously (or set up a different option in Mac OS System Preferences).

 

Shouldn’t control + click still work, though?

Posted

It should still work, but it doesn't (I'll add that to the list of things to do).

Posted

The full list is quite long. The list for the next release is much shorter, and involves quick easy things like this. 

  • 2 weeks later...
Posted

@imron

It would be especially great if CTA could also mark and counter 'head' characters even though they're of conditional nature. They can be considered 'names' of syllables, much as letters had names in Old English and other old languages, e.g. Thorn (þ), Wynn (ƿ), Ash (æ), Yogh (ȝ). It's an effective way to organize a set of characters in one's mind after all, so why not use it?
I described the concept of 'head characters' here, particularly in the last self-reply:
https://www.chinese-forums.com/forums/topic/59743-meaningphonetics-based-system-to-learn-characters-adopted-for-english-speakers/

Posted

Forgive me if this has been discussed, but I'm new here and there are 30 pages.

 

Been playing around with CTA and I'm finding it useful to scroll through unknown word lists and pick out single-character words, as very often they're part of a name or, almost never, a segmentation error, or a new or rare word. Two things that would make this easier...

 

1) A sort by length view in the word window.

2) When looking for those characters, a way to find them only if they're NOT part of an existing word. So if I can see there are 10 instances of 诊 segmented as a single character in the word view, I don't need to click through all the correctly segmented 诊所 to find the 诊间. That is, a find function that ignores matches that are already part of a larger word, I suppose.

 

Do these things exist? Should they? 
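To make the two requests above concrete, here is a minimal Python sketch. It is not CTA code and does not use CTA's scripting API; it assumes the text has already been segmented into a plain list of tokens, and the function names `words_by_length` and `standalone_matches` are hypothetical. The first helper sorts words by length, the second finds a character only where it was segmented as a token on its own.

```python
from typing import List, Tuple


def words_by_length(words: List[str]) -> List[str]:
    """Sort words shortest first, so single-character words come out on top."""
    return sorted(words, key=lambda word: (len(word), word))


def standalone_matches(tokens: List[str], char: str) -> List[Tuple[int, int]]:
    """Return (token_index, text_offset) for every token that is exactly `char`.

    Occurrences of `char` inside a longer token (e.g. inside 诊所) are skipped.
    `tokens` is an assumed segmenter output, not something CTA exposes this way.
    """
    matches = []
    offset = 0  # character offset of the current token in the joined text
    for index, token in enumerate(tokens):
        if token == char:
            matches.append((index, offset))
        offset += len(token)
    return matches


# Hypothetical segmentation of a short sentence.
tokens = ["我", "去", "诊所", "，", "在", "诊", "间", "等", "诊所", "开门"]
print(standalone_matches(tokens, "诊"))
print(words_by_length(["诊所", "诊", "开门", "间"]))
```

A plain substring search would hit 诊 three times in this sample; comparing whole tokens leaves only the one occurrence that was segmented as a single character.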

Posted
On 2/24/2020 at 9:51 AM, Jan Finster said:
On 2/24/2020 at 9:49 AM, imron said:

it would be possible to write a Lua script that built a list of known characters from CTA's list of known words, and then use that to build a list of characters in a document that were not on that list.  If that's something you would really like, I can probably write a quick script to do it.

 

I would love it if you or anyone else capable of writing such a script could do this ?

Bump ?

 

  • Like 1
Posted
21 hours ago, Jan Finster said:

Bump ?

Here you go.

 

Download that file somewhere to your computer, and then open it from within the Run Script dialog (Tools->Run Script).  Feel free to ask follow-up questions in the linked thread.
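For readers who just want the idea behind such a script, here is a rough Python sketch of the approach described in the quoted post. It is not the linked Lua script and does not use CTA's API. The known-word list and the sample text are made up, the helper names are hypothetical, and treating only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) as "characters" is a simplifying assumption.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def is_cjk(ch: str) -> bool:
    """True for characters in the basic CJK Unified Ideographs block (assumption)."""
    return "\u4e00" <= ch <= "\u9fff"


def known_characters(known_words: Iterable[str]) -> Set[str]:
    """Every character that appears in at least one known word counts as known."""
    return {ch for word in known_words for ch in word if is_cjk(ch)}


def unknown_characters(text: str, known_words: Iterable[str]) -> Dict[str, int]:
    """Count the characters in `text` that appear in no known word, most frequent first."""
    known = known_characters(known_words)
    counts = Counter(ch for ch in text if is_cjk(ch) and ch not in known)
    return dict(counts.most_common())


# Made-up data for illustration only.
known_words = ["我", "医生", "去", "看"]
text = "我去诊所看医生，诊所很远。"
print(unknown_characters(text, known_words))
```

Punctuation is skipped by the `is_cjk` check, so only Chinese characters end up in the result.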

  • Helpful 1
Posted

New release is up, it includes:

  • Updated CC-CEDICT to use the latest version
  • Added word length column to the wordlist view
  • Fixed bug where words that spanned a line weren't highlighting correctly
  • macOS: Added support for standard PageUp and PageDown keyboard shortcuts with Fn Up and Fn Down
  • macOS: Added support for system-wide dark mode
  • macOS: Ctrl-left click will now display the popup menu in the textview
  • macOS: Fixed bug where dictionary definitions weren't showing in dark mode

@roddy you can now sort by word length using the word length column (hidden all the way on the right, so you may need to scroll).  Searching for individual characters not in words is going to be a bit trickier than I expected, so it didn't make it into this release.

  • Like 3
Posted
On 2/25/2020 at 3:36 AM, Pall said:

It would be fine to add also a function that can mark characters from HSK1, 2, 3, 4, 5, 6

I get that people are interested in things like this, but CTA aims to subtly push people away from thinking in terms of HSK.  In fact I only provide the HSK statistics to drive home the point that for most native content, the vocabulary for the HSK doesn't give you very much at all, and you're better off using frequently occurring words in what you are reading.

 

On 3/25/2020 at 7:24 AM, Pall said:

It would be especially great if CTA could also mark and counter 'head' characters even though they're of conditional nature.

I had a look through your link, but I'm still not entirely sure what you mean by head characters.

Posted

All together now:
 

Did you ever know that you're my hero
And everything I would like to be?
I can fly higher than an eagle
For you are Imron beneath my wings

  • Like 3
