Chinese-Forums

Outlier Linguistic Solutions



Posted

Outlier has made great improvements recently with so many new characters, but sometimes I miss having more 甲骨文, 篆文, etc. For instance, you explain that 教 has a 爻 component, but the entry doesn't show any picture, so we can only guess where it should appear. Are there any plans in the near future for updating those characters? Also, I bought the Expert Edition, but I often skip that part because it's almost hidden in a link at the end of the entry, while the stroke order you show is wasted space for me. Could it be possible to show the expert information (when available) under the basic explanation?

Posted
16 hours ago, Dlezcano said:

Could it be possible to show the expert information (when available) under the basic explanation?

 

That's on our to-do list, but it would be dicey at the moment because dictionary definitions are rendered as simple rich text views rather than HTML - we have to pop up a separate screen to show complicated HTML content like that in Outlier notes.

  • Like 1
Posted
On 1/21/2021 at 6:55 AM, Dlezcano said:

Are there any plans in the near future for updating those characters?

The Expert Edition will contain ancient/original forms (at least one, usually more) for all entries. We're basically adding them as we create the Expert entries.

  • Like 1
  • 3 months later...
Posted

I honestly might have asked this in this thread already, I cannot remember, but after backing the first project on Kickstarter I backed the Japanese Kanji version. Given that the Kanji version had some temporary measures to get the content out, first Pleco, and then the Kanji Study version, before ending up in Yomiwa, I'd really like to see both the Chinese and Japanese Outlier dictionaries appear in Wenlin. Would licensing allow for this? What do we need to do to make it happen? :D

  • 1 month later...
Posted

Hi all,

 

We just released a huge update to our character dictionary for Pleco.

 

This update will be included in both the Essentials and Expert Editions of the dictionary (but not Mini).

 

The "System-level Info" adds a ton of useful information about how Chinese characters work on the system level. This makes the sound and meaning connections between characters super clear.

 

For any component, or character that acts as a component in other characters, you can see all the characters it shows up in, broken down by function.

 

For example, you can see every character containing 生, whether it's a sound component, semantic component, or empty component.

 

We also show you every character it shows up in as a radical—you'll see very clearly that "radicals" and "semantic components" are NOT the same thing. For example:

 

Quote

This is the semantic series for 生.

Meaning tree for the component:

 

  (orig.) to grow, to emerge

 

* 產 chǎn: [(orig.) to grow]; give birth, bring forth, produce

* 姓 xìng: [(orig.) sign of belonging to a clan (i.e., a clan name)]; one's family name; clan, people

 

[...]

 

This is KangXi radical 100. Radical list (2 members):

* 產 chǎn: give birth, bring forth, produce

* 甥 shēng: sister's child

 

生 is the radical for 甥, but it isn't a semantic component—it's a sound component in 甥.
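
To make the radical vs. semantic component distinction concrete, here is a minimal Python sketch of an index from a component to the characters it appears in, grouped by function. The triples come only from the 生 examples in this post; the names `ROLES`, `build_index`, and `radical_only` are made up for illustration and say nothing about how the actual dictionary data is stored.

```python
from collections import defaultdict

# (character, component, function) triples taken from the 生 examples in this post.
# 姓 appears twice because it is listed in both the semantic and the sound series.
ROLES = [
    ("產", "生", "semantic"),
    ("姓", "生", "semantic"),
    ("姓", "生", "sound"),
    ("甥", "生", "sound"),
    ("產", "生", "radical"),
    ("甥", "生", "radical"),
]


def build_index(roles):
    """Map component -> function -> list of characters, in input order."""
    index = defaultdict(lambda: defaultdict(list))
    for char, component, function in roles:
        index[component][function].append(char)
    return index


def radical_only(index, component):
    """Characters filed under the radical that do not use it as a semantic component."""
    by_function = index[component]
    semantic = set(by_function["semantic"])
    return [c for c in by_function["radical"] if c not in semantic]


index = build_index(ROLES)
print(index["生"]["semantic"])     # ['產', '姓']
print(radical_only(index, "生"))   # ['甥']
```

The second call picks out exactly the case described above: 甥 is filed under radical 100, but 生 only gives its sound.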

 

Note that the semantic series have the original meanings in red text, since the focus is on meaning. The sound series (below) have the pronunciations in red. This makes it really easy to see all the important information at a glance.

 

You'll also see the range of pronunciations a sound component can give. For example, the following characters contain 生 as a sound component:

 

Quote

This is the sound series for 生 shēng.

  • 性 xìng: nature, character; sex
  • 星 xīng: a star, planet; any point of light
  • 姓 xìng: one's family name; clan, people
  • 牲 shēng: sacrificial animal; animal
  • 甥 shēng: sister's child
  • 笙 shēng: small gourd-shaped musical instrument
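
As a rough illustration of what "range of pronunciations" means here, the following Python sketch strips the tone marks from the readings listed above and counts the distinct syllables. It is only a sketch: `SHENG_SERIES`, `strip_tones`, and `pronunciation_range` are hypothetical names, and the data is just the six readings quoted above, not the dictionary's own data format.

```python
import unicodedata
from collections import Counter

# (pinyin, gloss) pairs copied from the sound series for 生 quoted above.
SHENG_SERIES = [
    ("xìng", "nature, character; sex"),
    ("xīng", "a star, planet; any point of light"),
    ("xìng", "one's family name; clan, people"),
    ("shēng", "sacrificial animal; animal"),
    ("shēng", "sister's child"),
    ("shēng", "small gourd-shaped musical instrument"),
]

# Combining macron, acute, caron, grave: the four pinyin tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin):
    """Remove tone marks but keep the diaeresis of ü (e.g. lüè -> lüe)."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def pronunciation_range(series):
    """Count how many characters in a sound series share each toneless syllable."""
    return Counter(strip_tones(pinyin) for pinyin, _gloss in series)


print(pronunciation_range(SHENG_SERIES))
```

For the six readings above this gives two syllables, xing and sheng, three characters each.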

 

For characters with seemingly weird sound components, this info can be super useful. For example, how can we say that 各 is a sound component in 路, when gè and lù seem so far apart?

 

Quote

 

This is the sound series for 各 gè.

  • 路 lù: road, path, street; journey
  • 略 lüè: approximately, roughly; outline
  • 洛 luò: river in Shanxi province; city
  • 駱 luò: a white horse with black mane; a camel
  • 烙 lào (luò): brand, burn; branding iron
  • 酪 lào (luò): cream, cheese; koumiss
  • 絡 luò (lào): enmesh, wrap around; web, net
  • 賂 lù: bribe; give present
  • 格 gé: pattern, standard, form; style
  • 胳 gē: armpit, arms
  • 骼 gé: bone; skeleton; corpse
  • 閣 gé: chamber, pavilion; cabinet
  • 客 kè: guest, traveller; customer
  • 咎 jiù (gāo): fault, defect; error, mistake

 

 

Check out this video for more info: https://www.youtube.com/watch?v=DCgSOaLCy6U

 

If you don't have the dictionary yet, you can get it here: https://www.outlier-linguistics.com/products/outlier-dictionary-of-chinese-characters

 

Use the discount code 'systemdata' at checkout for 20% off the dictionary and/or anything else in the store!

On 5/1/2021 at 5:47 PM, NinKenDo said:

I'd really like to see both the Chinese and Japanese Outlier dictionaries appear in Wenlin. Would licensing allow for this? What do we need to do to make it happen?

 

 

It's definitely possible. :)

  • Like 4
Posted

@OneEye Finally, looks fantastic!

 

Two remarks, could also be related to Pleco @mikelove

1. When I look up 好 and click on 女 in the outlier decomposition, it will look up 女子, for 如 it will look up 女口 :D  That's probably not intended...

2. Will the characters in the system level data also be clickable in the future? I'm kind of used to being able to look up any character in Pleco directly.

Posted
8 hours ago, wibr said:

1. When I look up 好 and click on 女 in the outlier decomposition, it will look up 女子, for 如 it will look up 女口 :D  That's probably not intended...

 

Was this also true in the previous version? We didn't really change any part of the data files in this new one except to add links to the system data.

 

8 hours ago, wibr said:

2. Will the characters in the system level data also be clickable in the future? I'm kind of used to being able to look up any character in Pleco directly.

 

Yes - the basic problem there is that we currently do Outlier notes in HTML in a popup web browser window because a lot of them involve complicated features our own text rendering system doesn't currently support (tables, e.g.).

 

It's not at all difficult to add the ability to mix HTML / non-HTML notes in the same dictionary; it's just not something we had any reason to implement before now, and in fact we're planning to do that in our next minor bug-fix update. But even after we release that update, we'll have to wait another month or so before it's widely distributed enough that we could release a new version of Outlier that relied upon it. Since we didn't want to make everybody wait that long for their system data, we put it out this way for now. We'll probably have that done by the time Outlier's next entry update is ready.

  • Helpful 1
Posted

Thanks, sounds good!

 

Regarding 1., I've never noticed it before but that could also be because I've never tried those specific cases. Maybe someone can try to reproduce it before updating the dictionary...

 

 

Posted

When it comes to original meanings, does the Outlier dictionary make it clear whether or not the original meaning has died out? It would be nice to have the † symbol or similar, like in the Oxford English Dictionary.

  • Good question! 1
Posted

Yes, we do plan to add something like that, or perhaps even an indication of frequency in modern Chinese for each meaning, in a future update.

  • Like 1
Posted
5 hours ago, OneEye said:

Yes, we do plan to add something like that, or perhaps even an indication of frequency in modern Chinese for each meaning, in a future update.

 

That's excellent.

 

Is there a list of all characters currently added to the traditional version? Since I'm practicing writing in Anki, it would be nice to prioritize the ones already added, since the information in each entry really helps with recall.

 

Also, since I know you're one of the etymology experts on here, for rare characters, which would be the best online resource to find reliable etymology and/or component breakdown information? Chinese sites included.

Posted

Hi John 

 

Firstly, fantastic update, great to see Outlier consistently pushing out great content, thanks for the hard work.

 

One thing I feel is missing in the update is a specific listing for the Old Chinese pronunciations that have caused characters to show variance in their sound series. The example of 'klaag' in your video is really interesting, helpful both for academics and for language learners forming memory paths. Any way we can get a header in the sound series for 'Old Chinese pronunciation'? I'm still switching back and forth from Kroll for some help, but that's often not enough (e.g. his Middle Chinese for 各 is 'kak', so it still doesn't really help tie together the 'l-' and 'ge' pronunciations).

  • Like 1
Posted

We're planning to add something we've been calling the "sound formulas," which aren't OC reconstructions of course, but do show the range of sounds a given sound component can represent. I believe those will appear in the system data, for both Essentials and Expert.

 

As for OC reconstruction, we may be able to do that, but it would have to be an Expert Edition feature (because they're not "essential" for learners). We've even talked about doing a "scholar edition" that has much more complete info than even what we have for the Expert Edition, but that would come later, of course. If we were to do that, I think we'd want to include several scholars' reconstructions—at least Baxter 1992, Baxter-Sagart, 鄭張尚芳, and possibly 李方桂, 潘悟云, 王力, and Karlgren for completeness.

  • Like 3
Posted
On 6/26/2021 at 6:37 AM, OneEye said:

"scholar edition" that has much more complete info than even what we have for the Expert Edition

I find this a bit jarring: wasn't the expert edition specifically for use in academic (among other) environments? I bought the expert edition specifically for the scholarly references (particularly the 季旭昇 references, which abound, for obvious reasons). References to Baxter, Karlgren (maybe even some Pulleyblank for contextualisation), etc. would seem to fit in the same bracket. It's something I would expect in an expert version of the dictionary, as you state at the beginning of your comment, and I would argue it shouldn't be separated into (another) iteration of the dictionary.

 

  • Like 1
Posted
12 hours ago, Tomsima said:

wasn't the expert edition specifically for use in academic (among other) environments?

 

We never said that, no. I believe the closest thing we've said is that academics could use it as a jumping-off point for further research. Both the Essentials and Expert Editions have always been intended for learners, first and foremost. The Expert entries are highly curated and don't present the whole picture, by any means. That being said, for academics not working in paleography, excavated texts, etc., the Expert Edition may be enough.

 

If we were to do a scholars edition (it probably wouldn't be called that), it would be intended as a research tool, not a pedagogical one. It would contain tons of raw data: as many ancient forms as we can possibly get our hands on, organized by time period, medium, etc. and with all necessary citation info (vessel name/集成 numbers, maybe even 上下文, etc.) rather than just the curated set in the Expert entries, as much OC and MC info as possible, and so on. It would likely be an online database rather than a Pleco add-on, as it would likely be several gigabytes as an add-on (I think the Expert Edition is already the largest Pleco add-on, due to the number of SVG files included).

 

Note that I'm not committing to us doing this, and that if we did, it would likely be several years in the future. Nor am I saying we won't include some OC info in the Expert Edition. We probably will. But we intend for the Expert Edition to remain a tool for learners, which requires that we curate the info we include and guide people through it, as the vast majority of learners won't be able to make sense of, for example, multiple OC reconstructions.

Posted
36 minutes ago, OneEye said:

It would likely be an online database rather than a Pleco add-on, as it would likely be several gigabytes as an add-on (I think the Expert Edition is already the largest Pleco add-on, due to the number of SVG files included).

 

汉语大词典 is bigger, actually, though that's mostly because our full-text Chinese index is not very space-efficient; at the same time, Expert would be a lot smaller if we weren't turning those SVGs into PNGs (for lack of a satisfactory universal cross-platform SVG rendering solution). We've made those indexes smaller for 4.0 - some of that extra space went to other purposes, like indexing example sentence pinyin, but absent those feature additions, databases are about 50% smaller now.

 

We could certainly accommodate a multi-GB database - we're already tentatively planning to do that with Chinese Wikipedia once we have the spare time to write a converter script - but yeah, something of this scale certainly might make more sense as an online database, particularly if you plan to update it frequently.

  • Like 1
Posted
4 hours ago, mikelove said:

for lack of a satisfactory universal cross-platform SVG rendering solution

I've used nanosvg for parsing SVGs, which is a single-header C/C++ lib that returns a list of paths, and then used native rendering to display the paths. It probably only handles simpler SVGs, but maybe Chinese characters meet that definition?
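
For anyone who hasn't seen the "parse into a list of paths, then draw natively" approach, here is a rough sketch of the parse half. This is not nanosvg (which is a C library); it is a Python stand-in using only the standard library, and `extract_paths` and `SAMPLE_SVG` are hypothetical names. It just pulls the `d` attribute out of each `<path>` element and splits it into commands and numbers, which a platform's own drawing code could then walk.

```python
import re
import xml.etree.ElementTree as ET

# One SVG path command letter, or one number (optional sign, decimals, exponent).
TOKEN_RE = re.compile(r"[MmLlHhVvCcSsQqTtAaZz]|-?\d*\.?\d+(?:[eE][-+]?\d+)?")

SAMPLE_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<path d="M10 10 L90 10 L90 90 Z"/>'
    '<path d="M20,50 C20,20 80,20 80,50"/>'
    "</svg>"
)


def extract_paths(svg_text):
    """Return one token list per <path> element, in document order."""
    root = ET.fromstring(svg_text)
    paths = []
    for element in root.iter():
        # Tags look like '{namespace}path'; compare only the local name.
        if element.tag.rsplit("}", 1)[-1] != "path":
            continue
        paths.append(TOKEN_RE.findall(element.get("d", "")))
    return paths


for tokens in extract_paths(SAMPLE_SVG):
    print(tokens)
```

A real renderer would also need to handle transforms, fills, and shapes other than `<path>`, none of which this sketch attempts.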

Posted
6 hours ago, imron said:

I've used nanosvg for parsing SVGs, which is a single-header C/C++ lib that returns a list of paths, and then used native rendering to display the paths. It probably only handles simpler SVGs, but maybe Chinese characters meet that definition?

 

Will give it a try, thanks! Looks like Telegram uses it in their iOS app + has a nice convenient iOS wrapper around it, though I'm not finding an equivalent on Android, so we'd probably have to roll that ourselves (not a big deal though).

  • 2 weeks later...
Posted
On 6/24/2021 at 6:06 AM, OneEye said:

Note that the semantic series have the original meanings in red text, since the focus is on meaning. The sound series (below) have the pronunciations in red. This makes it really easy to see all the important information at a glance.

 

Small suggestion: since Pinyin rendered in colors often indicates tone (mā, má, mǎ, mà, ma), would you consider changing the styling to bold instead?

  • 2 months later...
Posted

When referring to the components of a character, Outlier will often say that a component either "indicates", "points to" or "hints at" a certain meaning.

Is the decision whether to use "indicates", "points to" or "hints at" made at random? Or do they mean different things?
