Chinese-Forums

ParseRime (析韻) LMC Input Method



Posted


PRime (ParseRime IME), or 析韻輸入法, is a diaphonemic pan-topolectal input method based on the Yunjing (韻鏡), the oldest known rime table reflecting the literary language of the Tang-Song period (Late Middle Chinese). Because the modern topolects have each evolved away from that language, the amount of training and experience required to master this input method depends on the number of topolects with which a given user is acquainted. As Middle Chinese is the greatest common factor of most Chinese topolects and Sino-xenic lexicons (primarily in the literary register), it is the natural choice of foundation for a universal diaphonemic input method. When difficulties in syllable construction arise, the information required to construct a given character can be found in a number of Chinese and Sino-xenic resources. This input method may be seen as a diaphonemic analogue to graphical input methods like Cangjie, Wubi, and Dayi, all of which also embody a pan-topolectal approach. PRime boasts the highest degree of accuracy, with the lowest number of keystrokes, of any Chinese phonetic input method: the diaphonemic description is so narrow that the character-candidate lists are kept remarkably short, and within those lists, commonly used characters represent only a small fraction of the total number of characters available. The project began on the 30th of May, 2014.

 

The typist compiles a string of four Middle Chinese syllabic properties to construct each desired character: initial (聲母 ×36), division/rounding (等呼 ×4), tone (聲調 ×4), and rime-group (韻攝 ×20), collectively termed 析韻根字 or 母呼聲攝 (with a nicely round figure of sixty-four symbols in total). The extended rime-group set of PRime contains four additional members outside of the classical set in order to fully distinguish divisions I (一等) and II (二等) within the 'outer' (外轉) rime-groups. As divisions III and IV have been merged in this system (with the distinction between divisions I and II already accounted for), the remaining four members of the 'division/rounding' category will henceforth be referred to as 'medials'.

The keyboard itself is arranged in two layers, each containing a chart of syllabic properties. The Initial layer (母層), as the name implies, carries the initials, whilst the Final layer (韻層) carries the medials, tones, and rime-groups (collectively comprising the finals). On the Initial layer, each column represents a classical initial-group roughly corresponding to the place of articulation (聲系), whilst each row indicates a voicing/aspiration status roughly corresponding to the manner of articulation (清濁). On the Final layer, each column represents a coda (韻尾), whilst each row represents a nucleus (韻腹); naturally, this arrangement does not apply to the medial (等呼/韻頭) and tone (聲調) keys, which are placed in the central region of the keyboard.

Once a key on the Initial layer is pressed, the keyboard immediately shifts to the Final layer; once a rime-group key is pressed (following the optional selections of medial and tone), the resulting character is entered automatically, and the keyboard reverts to the Initial layer, allowing the process to begin a new cycle for the next desired character.
When no medial or tone is specified, open/unrounded (開呼) and even/level/flat (平聲) serve as their respective default values, as these values are held by the greatest number of characters in the Chinese lexicon; this arrangement allows for as few as two keystrokes per character and no more than four (not including the SHIFT key, which reveals the ten 全濁 voiced initials). There is also a 'phonetic component' mode, which allows the user to output the desired character by entering the value of its phonetic component in citation form (reflecting the component's pronunciation when used independently).
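
The key cycle described above can be sketched as a small state machine. This is only an illustrative sketch, not the actual PRime implementation: the `Composer` class and its `press` method are hypothetical names, keys are represented by their Chinese labels, no validation of initials or rime-groups is done, and the medial and tone keys are accepted in either order (an assumption; the post does not say whether the order is fixed). Candidate lookup is left out entirely.

```python
# Symbol inventory from the post: 36 initials + 4 medials + 4 tones
# + 20 rime-groups = 64 symbols. Only medials and tones are listed here,
# because they are the keys the sketch has to recognise on the Final layer.
MEDIALS = {"開", "齊", "合", "撮"}
TONES = {"平", "上", "去", "入"}


class Composer:
    """Hypothetical model of one initial -> (medial) -> (tone) -> rime-group cycle."""

    def __init__(self):
        self._reset()

    def _reset(self):
        self.initial = None
        self.medial = "開"  # default medial: open/unrounded
        self.tone = "平"    # default tone: level

    @property
    def layer(self):
        # The keyboard shows the Initial layer until an initial has been chosen.
        return "initial" if self.initial is None else "final"

    def press(self, key):
        """Return the finished (initial, medial, tone, rime-group) tuple
        when a rime-group key is pressed; otherwise return None."""
        if self.layer == "initial":
            self.initial = key  # keyboard shifts to the Final layer
            return None
        if key in MEDIALS:
            self.medial = key
            return None
        if key in TONES:
            self.tone = key
            return None
        # Any other key on the Final layer is treated as a rime-group key.
        syllable = (self.initial, self.medial, self.tone, key)
        self._reset()  # keyboard reverts to the Initial layer
        return syllable


composer = Composer()
composer.press("見")
print(composer.press("果"))  # two keystrokes, defaults fill in medial and tone
```

With the defaults in place, the shortest sequence is two key presses (initial, rime-group) and the longest is four (initial, medial, tone, rime-group), which matches the keystroke range given in the post.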

 

[Image: PRime keyboard layout]

 

 

Posted

This is a brilliant idea!

 

If it is possible to get working on ibus for Linux (I don't see why not, you just need to set up tables to work with ibus-table or something), I would definitely try this out!

 

I am currently a happy Cangjie user, so I don't really need a new IME for productivity purposes, but I think this input method sounds like a fun pedagogical exercise: what better way to memorise the Middle Chinese rimes?

Posted

Takeshi, thank you for the kind words.

 

Currently, I am programming this IME using OpenVanilla, which is a Macintosh-exclusive application (and a frustratingly limited one, at that). However, once I either brush up on my programming chops or find a talented software engineer with whom I may collaborate (whichever comes first), this system can be easily tailored for cross-platform implementation.

 

Obviously, my initial post does not offer an exhaustive tutorial for PRime, but one such document is currently in the works, and will be minted once the program itself is fully operational and ready for a proper release. Aside from the programming aspect, another crucial step in development is to locate a suitable Late Middle Chinese source of diaphonemic descriptions for each desired character within the massive Chinese lexicon–in digital form. So far, one of my friends was generous enough to provide me with such a source for an Early-Late Middle Chinese lexicon; many alterations will be required to convert the diaphonemic values into their appropriate LMC forms. I am also using a copy of the National Pronunciation Character Dictionary (校改國音字典) from 1921 as a primary reference–since it offers LMC values for each entry in the desired format (母呼聲韻)–but unfortunately my copy is a PDF file of scanned pages.

Posted

If you need an example of how to write an IME for Windows, Pinyinput (written by me in C++) is open source. I'm too busy with other projects at the moment to collaborate on anything, but feel free to ask questions about the source code if you need to.

Posted

Awesome.

 

I was going to say I'd like one for EMC, but then I saw this.

Posted

Yeah, the most difficult part is getting a digital source of the phonemic descriptions for all of the characters, in the format you want. Actually, I think that once you have that, porting the input method to any other framework would be trivial.

 

I can't really help you with the digital form description thing, but I wish you best of luck!
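
To make the point about the data concrete, here is a minimal sketch of how such a digital source might be loaded once it exists. Everything about the format is an assumption for illustration: a tab-separated file with one character per row followed by its initial, medial, tone, and rime-group, and a hypothetical `load_lexicon` helper. The sample rows use dummy labels (`X1`, `X2`, `X3`) in place of characters and are not real Late Middle Chinese readings.

```python
import csv
import io
from collections import defaultdict

# Hypothetical row format: character, initial, medial, tone, rime-group.
# X1..X3 are dummy labels standing in for characters; the property values
# only show the shape of the data and are NOT real readings.
SAMPLE = (
    "X1\t見\t開\t平\t果\n"
    "X2\t見\t開\t平\t果\n"
    "X3\t知\t撮\t去\t通\n"
)


def load_lexicon(text):
    """Group characters by their (initial, medial, tone, rime-group) key."""
    lexicon = defaultdict(list)
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row:
            continue  # tolerate blank lines in the source file
        char, initial, medial, tone, rime = row
        lexicon[(initial, medial, tone, rime)].append(char)
    return dict(lexicon)


lexicon = load_lexicon(SAMPLE)
print(lexicon[("見", "開", "平", "果")])  # candidate list for one key sequence
```

Under these assumptions, each IME framework would only need a thin adapter that converts this mapping into its own table format, which is why the lexicon itself is the harder problem.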

Posted

@imron

 

This is extremely helpful, thank you! Fortunately, I have a bit of experience in C89 under my belt, though I had focused primarily on digital audio applications in the past. I will likely have a fair number of questions for you in the near future, so thank you for offering your assistance.

 

---

 

@Hofmann

 

Polyhedron's system seems to be the best when it comes to typing in EMC, though I am not too keen on a few aspects of his proprietary Pinyin scheme (namely assigning several of the voiced fricative initials to [zs] and its derivatives).

 

---

 

It has recently come to my attention that a Japanese input method editor named 'PRIME' (predictive IME) has already been released on Ubuntu:

https://apps.ubuntu.com/cat/applications/prime-dict/

 

I will continue to refer to my system as PRime (or as 析韻輸入法, its official Chinese name) until I figure out how to deal with this situation. Any suggestions are welcome, of course.

Posted

Here is my proposal for a Romanized PRime/Orthographic LMC (which I use among friends for fun), intended as a PRime teaching tool. The tones and initials have been discussed elsewhere.

//iɛ// applies to <ia>, <iau>, <ian>, etc. <iua> is similarly //yɛ//.

[Image: Romanized PRime chart (iXuqz.png)]

 

Edit: actually, "wi" seems to be 撮? I'll have to move it to the proper column. I'm thinking about whether I should change it to iui... Likely not.

Edit needed for "i"

Posted

Btw, supposing you're catering to users of both simplified and traditional characters, I suggest having some sort of toggle for full simplified or full traditional input, instead of the way the input method I'm using does it (listing both as candidate characters), since you wanted something like Cangjie: ideally just one result for a given combination of keystrokes. Unless you don't plan to cater to simplified characters here.

Edit: unlike the caps lock switching between Chinese and English, it could be set with the keyboard layout or something, as in, not a key that toggles between trad and simp, but a setting.

 

Also, since you have been saying that users could use ZDic to check (or at least that ZDic is one of your sources), and ZDic is more faithful to EMC, perhaps you'll have to pay attention to the Mandarin -uang + Cantonese -ong series. Weren't they -iâng in EMC and -âng in LMC?

 

Edit: Actually, for a Romanized PRime more suited to the input method, perhaps it'd be better to just have the four 呼 as --, i, w/u, iu and then name the shieps by their 開 names. That way it'd avoid the possible inconsistencies of -ei, -i and -wi, which would be reserved for ortho LMC. So, they become -iai, -i/-ii and -iui maybe, unless 撮 only marks the presence of both u and i, in which case -wi shouldn't be too much of a problem, although it'd mislead people into thinking it's from 合.

  • 2 weeks later...
Posted

I shall begin work on the official IME software (in ANSI C) shortly. In the meantime, here is a prototype that I had withheld until it reached a satisfactory (albeit imperfect) state:


(see original post)

This program is far from perfect, but I feel that it provides an adequate starting point for introductory purposes. Updates are conducted on a continuing basis, and major ones shall be announced and shared in this thread. The latest versions of the PRime instruction manual and character dictionary have been appended to the first post of this thread.

  • 1 year later...
Posted

I have finally crafted a standard Romanisation for PRime (析韻拼音). Please be advised that it is not pretty, but this is because the system is purely diaphonemic, and is not intended to represent any specific phonetic values. This Romanisation adopts all 26 letters of the English alphabet alongside the Arabic numerals 1 through 4 (for tones) to maximise compatibility (especially within programming environments). That being said, the letter assignments are not arbitrary; they are all based on broad phonetic resemblances and also represent an amalgamation of several established Chinese orthographies. Whilst the standard version of PRime defines each character with exactly four phonemic symbols, the Romanised version of PRime may define a character with as few as half the number of symbols, or with as many as twice the number of symbols.

 

INITIALS

幫P   滂PH  並B   明M
非F   敷FH  奉V   微MV
端T   透TH  定D   泥N
知TR  徹THR 澄DR  娘NR
見K   溪KH  群G   疑NG
精TS  清C   從DZ  心S   邪Z
照TSR 穿CR  床DZR 審SR  禪ZR
影Q   曉H   匣X   喻J
來L   日NJ

 

MEDIALS

開(無) 齊Y   合W   撮WY

 

TONES
平1   A    AI   AU   AM   AN   ANG   AUNG
上2   AR   AII  AUU  AMM  ANN  ANGG  AUNGG
去3   AA   AAI  AAU  AAM  AAN  AANG  AAUNG
入4                  AP   AT   AK    AUK

 

FINALS

果O   亥OI  豪OU  覃OM  寒ON  宕ONG
假A   蟹AI  效AU  咸AM  山AN  梗ANG 江AUNG
遇E   止EI  流EU  深EM  臻EN  曾ENG 通EUNG
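
As a rough illustration of how the tables above could be used in a program, the sketch below joins the letter values for an initial, medial, and rime-group and appends a tone numeral. Several things here are assumptions rather than part of the scheme as posted: the `romanise` function name, the initial + medial + final + tone ordering, always writing the tone numeral (including 1 for 平), and ignoring the letter-based tone spellings (AR, AA, AP, and so on) shown in the TONES table.

```python
# Letter values copied from the INITIALS, MEDIALS, and FINALS tables in the post.
INITIALS = {
    "幫": "P", "滂": "PH", "並": "B", "明": "M",
    "非": "F", "敷": "FH", "奉": "V", "微": "MV",
    "端": "T", "透": "TH", "定": "D", "泥": "N",
    "知": "TR", "徹": "THR", "澄": "DR", "娘": "NR",
    "見": "K", "溪": "KH", "群": "G", "疑": "NG",
    "精": "TS", "清": "C", "從": "DZ", "心": "S", "邪": "Z",
    "照": "TSR", "穿": "CR", "床": "DZR", "審": "SR", "禪": "ZR",
    "影": "Q", "曉": "H", "匣": "X", "喻": "J",
    "來": "L", "日": "NJ",
}
MEDIALS = {"開": "", "齊": "Y", "合": "W", "撮": "WY"}
FINALS = {
    "果": "O", "亥": "OI", "豪": "OU", "覃": "OM", "寒": "ON", "宕": "ONG",
    "假": "A", "蟹": "AI", "效": "AU", "咸": "AM", "山": "AN", "梗": "ANG",
    "江": "AUNG",
    "遇": "E", "止": "EI", "流": "EU", "深": "EM", "臻": "EN", "曾": "ENG",
    "通": "EUNG",
}
# Assumption: tones are written with the numerals 1-4 only.
TONE_NUMERALS = {"平": "1", "上": "2", "去": "3", "入": "4"}


def romanise(initial, rime_group, medial="開", tone="平"):
    """Concatenate the letter values; the ordering is an assumption."""
    return (INITIALS[initial] + MEDIALS[medial]
            + FINALS[rime_group] + TONE_NUMERALS[tone])


print(romanise("見", "果"))              # KO1
print(romanise("知", "通", "撮", "去"))  # TRWYEUNG3
```

The sketch is purely mechanical: it does not check whether a given combination actually occurs in the lexicon, and it leaves the coda unchanged for the entering tone.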

  • 8 months later...
Posted

I thought it would be fun to make a 'symbol' version of PRime (with Katakana, Zhuyin, and standard characters) to use for annotation (析韻符號). Amazingly, I got it down to exactly 100 symbols (always three symbols per syllable: Initial + Final + Tone).

 

(a guide is attached to the original post)

 

Yes, Unicode compatible:

 

ㄅㄆトㄇㄈキク万ㄉㄊ

大ㄋチㄔモㄑㄍㄎコ兀

ㄗㄘオㄙタㄓ屮ㄐㄕ丄
レㄏㄒイㄌㄖㄛカホ仌

ㄚㄝワヨㄜㄩㄨウㄞネ

ハナ土ヱ巾ヒヰニリㄠ

アㄡユロミㄢフラ干九

山テ乙スㄣ人寸セ几丈

ㄤ广ヤマヲルㄥ廾么弋

巳工ケ夂サー丨ノㄟゝ

 

(ー is used in horizontal text; 丨 is used in vertical text)

 

e.g. 中華(チケーㄒワー)
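
Because every syllable is always written with exactly three symbols (Initial + Final + Tone), an annotation string can be split back into syllables by position alone. The following is a small sketch of that idea; `split_syllables` is a hypothetical helper, and it assumes each symbol is a single Unicode code point, as is the case for the symbols in the example above.

```python
def split_syllables(annotation):
    """Split an annotation into (initial, final, tone) triples by position."""
    if len(annotation) % 3 != 0:
        raise ValueError("annotation length must be a multiple of three")
    return [tuple(annotation[i:i + 3]) for i in range(0, len(annotation), 3)]


# The annotation given for 中華 in the post.
for initial, final, tone in split_syllables("チケーㄒワー"):
    print(initial, final, tone)
```

The fixed width is what makes this work without a lookup table; a variable-length scheme such as the Romanisation would need one.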

  • 4 months later...
  • 5 months later...
Posted

I thought about providing the option to place the backspace/delete key in the upper-right corner rather than the lower-right corner of the mobile keyboard. This allows the grades (等第) of the finals (韻母) to line up correctly: 一三等 along the top row (外轉), 二三四等 along the middle row (外轉), and 一三等 along the bottom row (內轉). While most mobile inputs have the backspace/delete key in the lower-right corner, the mobile handwriting-recognition input method places it in the upper-right corner, so this move wouldn't be unprecedented (not to mention the fact that standard computer keyboards place it in the upper-right corner as well).

 

What do y'all think?

 

(please refer to original post)

  • 8 months later...
Posted

Only three taps (initial/final/tone) per character! Occupies the same area as normal keyboards with keys of the same size! All the initials and finals line up perfectly logically too, making them easy to memorise and locate! Each of the 100 symbols is user-assignable as long as the replacements are Unicode-compatible! See the newly appended images and the 'symbols & pronunciation' file from my original post for more details!

  • 4 years later...
