Chinese-Forums

All possible word combinations from a set of hanzi



Posted

Does anyone know of a website where I can paste in a group of characters and it will tell me all of the possible words that those characters can produce? For example, if I pasted (or typed) 好 and 你, it would tell me that I can make the word 你好, but on a larger scale, with dozens of characters. I saw something like this several years ago, but I don't remember where. Can anyone help me?

Posted

www.mdbg.net may be helpful. You can type a character (or more than one) and see all words in the dictionary that start with that character, end with that character, or include that character.

Mark

Posted

I had something like that up several years ago, but it was never very polished and only worked within the scope of the HSK lists. It's not particularly difficult to do, but it could put a bit of a load on a server I guess.

Posted

You could actually do it with regular expressions; no real code needed. Put a Chinese word list (for example, a list of all of the Chinese headwords from CC-CEDICT, one word per line) in a text editor with grep support, then do the following find-and-replaces with grep enabled:

Find:

([你好])

(using 你 and 好 as the example; in general it's a list of all of your characters with a ([ before them and a ]) after them)

Replace With:

@\1

(basically, adding a @ before every included character)

Then Find:

^[^@\r\n][^\r\n]*\n

And replace with nothing (deleting every word that begins with a character not in your list). Also Find:

^[^\r\n]*[^@\r\n]{2,}[^\r\n]*\n

And replace with nothing (deleting every remaining word that contains a character that's not in your list). Finally, Find:

@

And replace with nothing (removing the superfluous @s).

Haven't actually tested this, and there's probably a more efficient way to do it, but I think it would work.
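
For anyone who would rather check the steps outside a text editor, here is a rough Python sketch of the same four find-and-replaces, plus a plain set-based filter to compare against. It is only an illustration under a few assumptions: the function names and the five-word sample list are made up (it is not CC-CEDICT data), the word list has one word per line with plain \n line endings, and no word contains a literal @.

```python
import re


def filter_with_regex(word_list: str, chars: str) -> list[str]:
    """Apply the four find-and-replace steps to a newline-separated word list.

    Assumes `chars` is non-empty, lines end in "\n", and no word contains "@".
    """
    # The deletion patterns need every line, including the last, to end in "\n".
    text = word_list if word_list.endswith("\n") else word_list + "\n"

    # Step 1: put an @ in front of every character that is in the list.
    char_class = "[" + re.escape(chars) + "]"
    text = re.sub("(" + char_class + ")", r"@\1", text)

    # Step 2: delete every line that starts with an unmarked character.
    text = re.sub(r"^[^@\r\n][^\r\n]*\n", "", text, flags=re.MULTILINE)

    # Step 3: delete every line with two adjacent non-@ characters, which can
    # only happen when an unmarked character follows another character.
    text = re.sub(r"^[^\r\n]*[^@\r\n]{2,}[^\r\n]*\n", "", text, flags=re.MULTILINE)

    # Step 4: remove the @ markers again.
    text = text.replace("@", "")
    return [word for word in text.splitlines() if word]


def filter_with_set(words: list[str], chars: str) -> list[str]:
    """Keep the words whose characters all come from `chars` (no regex)."""
    allowed = set(chars)
    return [word for word in words if word and set(word) <= allowed]


if __name__ == "__main__":
    # Hypothetical five-word sample list, one word per line.
    sample = "你好\n你们\n好\n们\n好好\n"
    print(filter_with_regex(sample, "你好"))            # ['你好', '好', '好好']
    print(filter_with_set(sample.splitlines(), "你好"))  # ['你好', '好', '好好']
```

Both versions keep any word built only from the listed characters, so single characters and repeats such as 好好 are included. If each character should be used at most once per word, the set-based version would need an extra check.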
