yaokong Posted March 30, 2022 at 02:17 PM
On 3/28/2022 at 3:57 AM, imron said: "The code is definitely there for it and it should all be hooked up. Not sure why it's not working." Weirdest thing: PgUp/PgDn does work in the word statistics view and the word list view on Linux, whereas, as far as I remember, those are exactly the two views where it does not work on Windows. If I remember correctly, I could only use PgUp/PgDn in the text view on Windows.
MTH123 Posted April 7, 2022 at 08:11 PM
As a new member, I thought I should go back through the threads I’ve found particularly helpful and mark posts helpful, thanks, good question or like. But, I just can’t do it in this thread. It’s too darn long, and there are too many helpful posts. It took me hours a day for two days in a row to read through this thread. I just want to say that this software is helpful in so many ways from beginner to expert. It seems to me that it should be selling like hotcakes and imron should be rich. But, what do I know? I was never good at marketing in my geek job.
imron Posted April 7, 2022 at 09:28 PM Author
On 4/8/2022 at 6:11 AM, MTH123 said: "It seems to me that it should be selling like hotcakes and imron should be rich" This is what I think as well; unfortunately, it is not the case.
MTH123 Posted April 7, 2022 at 10:05 PM
On 4/7/2022 at 4:28 PM, imron said: "This is what I think as well; unfortunately, it is not the case." Has anyone ever offered to buy you out or partner with you? Maybe their marketing branch could get your sales way up, even if they take a hefty cut, like 33% or 50%. I was in a situation like this once when I was in my 20s. But, I thought that my partner-friends and I could do it on our own. I’ve often wondered what it would have been like if I had agreed to partner with people who had far better marketing capabilities. I ended up chalking it up to my karma being to not make money in a fast way. I'm totally okay with the amount of money I have to live on. Fortunately, geeks can earn decent money to live on in a slow way.
Flickserve Posted April 7, 2022 at 11:51 PM
On 4/8/2022 at 5:28 AM, imron said: "unfortunately it is not the case." It doesn't sell like hotcakes but Imron is still rich anyway.
imron Posted April 8, 2022 at 11:16 AM Author
On 4/8/2022 at 8:05 AM, MTH123 said: "geeks can earn decent money to live on in a slow way." For sure. I earn a decent amount, just not from CTA (or any of my other bits of software). If it brought in more, I’d be able to justify spending more time on it. I’ve had a couple of discussions with people about partnerships, but nothing ever came of them. On 4/8/2022 at 9:51 AM, Flickserve said: "Imron is still rich anyway" In non-monetary terms, yes.
New Members valkow Posted April 10, 2022 at 05:40 AM
Thank you for this tool! I'm not ready to tackle my first book, but I'm going to start loading the important words for it into my Anki deck now. That way I can count down until I'm ready to go for it. One question: when I add a custom word, does it remember it across files? The book I'm going to read is part of a series, so it would be nice if it remembered the names.
imron Posted April 11, 2022 at 12:26 AM Author
Yes, it will remember it across files.
MTH123 Posted April 21, 2022 at 12:34 AM
I’ve been toying with the idea of putting a text document through the Stanford Word Segmenter (https://nlp.stanford.edu/software/segmenter.html) and then putting the result through Chinese Text Analyser to maximize Chinese Text Analyser’s capability of generating a list of words with frequencies. Does anyone have any thoughts on this?
imron Posted April 21, 2022 at 01:10 AM Author
It will likely lead to better segmentation from CTA, because CTA starts and stops its segmentation algorithm each time it hits spaces or punctuation. If every word is separated by a space, then CTA will have a higher likelihood of using the correct segmentation, but for longer words or things like names, it might still incorrectly segment within the word. I've actually got a development build of CTA that segments entirely based on spacing, and if the text is passed through the SWS first, it would give an exact match on the segmented results. That being said, one of the reasons I've never gotten around to improving the CTA segmentation is that, for the most part, the frequency lists are accurate enough to be useful. Things like names are likely to be better detected going with the SWS approach; however, for most other words in a text, the most frequent words will still be at the top and the least frequent words will still be at the bottom. It's just that the ordering may be slightly different. When generating a frequency list from a text, the difference is not going to be significant enough to worry about, as you'll still get more high-frequency words than you'll be able to deal with. I'd be interested in seeing a comparison though if you do it. Contact me via email and I can set you up with the development version of CTA with space segmentation.
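To make the point about frequency lists concrete, here is a minimal Python sketch of counting words in text that has already been space-segmented (for example, by the Stanford Word Segmenter). It is not CTA's code and does not reproduce CTA's segmentation algorithm. The function name `frequency_list`, the sample sentence, and the choice to treat only characters in the basic CJK ideograph range as word characters are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of characters in the basic CJK ideograph block.
# Spaces, punctuation, Latin letters and digits all act as word boundaries.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")


def frequency_list(segmented_text):
    """Return (word, count) pairs, most frequent first.

    The input is expected to be pre-segmented, i.e. words are already
    separated by spaces, so no dictionary lookup is needed here.
    """
    counts = Counter(CJK_RUN.findall(segmented_text))
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical pre-segmented sample text.
    sample = "我 喜欢 学习 中文 。 我 喜欢 读书 。"
    for word, count in frequency_list(sample):
        print(word, count)
```

Running the same function on the output of two different segmenters and comparing the top of each list is one simple way to check how much the ordering actually changes.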
MTH123 Posted April 21, 2022 at 01:15 AM
On 4/20/2022 at 8:10 PM, imron said: "Contact me via email and I can set you up with the development version of CTA with space segmentation." Thank you so much for your post! But, whoa, most of what you said is too advanced for me. Please give me time to digest it.
MTH123 Posted April 21, 2022 at 01:25 AM
On 4/20/2022 at 8:10 PM, imron said: "Things like names are likely to be better detected going with the SWS approach" I've found that Google Translate is very good at translating names. That's one of the reasons why I use it when I'm translating Chinese subtitles.
MTH123 Posted May 8, 2022 at 06:40 PM
On 4/20/2022 at 8:10 PM, imron said: "I'd be interested in seeing a comparison though if you do it. Contact me via email and I can set you up with the development version of CTA with space segmentation." Thank you again for the development version of CTA with space segmentation! I have made a comparison involving SWS. I’ve placed the post in a new thread and provided a link here, as you suggested: https://www.chinese-forums.com/forums/topic/62192-chinese-text-analyser-and-stanford-word-segmenter/
Jan Finster Posted February 13, 2023 at 03:58 PM
@Imron: This is a trivial matter, but lately CTA always opens with a minimized window (see pic; blown up pic). I maximize the window before closing it, but it still always opens minimized. Any fix?
imron Posted February 13, 2023 at 09:55 PM Author
What OS + version?
Jan Finster Posted February 14, 2023 at 05:23 AM
On 2/13/2023 at 10:55 PM, imron said: "What OS + version?" Windows 10, CTA 0.99.18
yaokong Posted April 28, 2023 at 07:42 AM
@imron, is there a way to export a text file with all the words marked as known? Or could it be as easy as creating a new wordlist based on an existing one, so that the new wordlist file contains exactly the known words (and not the entire history of added and removed words)?
zenzero Posted May 5, 2023 at 08:16 AM
I've somehow missed this in the past, but will have a look-in later on tonight. Just to confirm, does the macOS version work with Apple Silicon/M1+ chips?
yaokong Posted May 5, 2023 at 03:44 PM
On 5/5/2023 at 4:16 PM, zenzero said: "does the macOS version work with Apple Silicon" It is an Intel app, but it runs just fine through the Rosetta 2 layer.