Chinese-Forums

Information Technology Vocabulary and Developing Chinese Language Web Sites with PHP


alexamies


I just posted several pages on the topic of vocabulary for information technology at chinesenotes.com among which are

On each page you can click on any word to see a summary of its meaning and use, including MP3 recordings. To support this I built a vocabulary database, which also lets users search conveniently. I also wrote an article on developing Chinese language web sites with PHP:

Processing Chinese Text with PHP

I hope that someone gets something from these. If anyone has any suggestions please reply to this post.


Looks interesting. I'm interested in finding some sort of resource which contains as many computer terms as possible, that I could possibly turn into a dictionary for Pleco, and use as a way to start learning how to do all the stuff I can in English on a computer, but in Chinese.

Found a couple of other websites, but I'm not sure how up-to-date they are, and they're at home at the moment. I'll post them when I get home.


What are the problems handling Chinese with PHP? GB2312 works fine. UTF-8 is backwards compatible with ASCII, and so doesn't break PHP. The only real issue is that the byte-oriented string manipulation functions break on UTF-8 text. But you can get around that by converting the text to hex before manipulating it, and converting back afterwards.
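To make the string manipulation issue concrete, here is a minimal sketch of how byte-oriented functions behave on UTF-8 text. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; the sample string and variable names are only illustrative, and the mb_* calls are one common way around the problem rather than the hex approach described above.

```php
<?php
// Assumption: this file is saved as UTF-8 and the mbstring extension is available.
$text = "中文网站"; // four characters, three bytes each in UTF-8

$byteCount = strlen($text);             // strlen() counts bytes
$charCount = mb_strlen($text, 'UTF-8'); // mb_strlen() counts characters

// substr() works on byte offsets, so it can cut through the middle of a character.
$brokenCut = substr($text, 0, 4);
$brokenIsValid = mb_check_encoding($brokenCut, 'UTF-8');

// mb_substr() works on character offsets and returns whole characters.
$cleanCut = mb_substr($text, 0, 2, 'UTF-8');

// Hex view of a single character, and the round trip back to text.
$hex = bin2hex("中");
$roundTrip = hex2bin($hex);

echo $byteCount, " bytes, ", $charCount, " characters\n";
echo "byte cut valid UTF-8? ", var_export($brokenIsValid, true), "\n";
echo "character cut: ", $cleanCut, "\n";
echo "hex of first character: ", $hex, "\n";
```

The hex round trip shows that nothing is lost by converting, but offsets in the hex string still have to land on character boundaries, which is the same bookkeeping the mb_* functions do for you.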

@ipsi - we have a host of computer terminology in the Adso project. You're welcome to create a Pleco-compatible file and distribute it if you'd like. The version Mike is distributing is about 60,000 entries out of date.


There are a number of problems with multi-byte character processing in general in PHP. I don't think that these are specific to UTF-8. Some examples are:

  • ord() does not work with multi-byte characters
  • ctype_* functions do not work with multi-byte characters
  • The MySQL database driver does not work well with multi-byte characters

There are workarounds for all of these issues, but the point is that you have to know them, and that is what the article is about.

As I understand it, GB2312 is a character set, not an encoding, so it should be compared with Unicode rather than with UTF-8. I would be concerned about using a character set that does not include all major languages. It seems a great limitation, especially in the age of Web 2.0 and mash-ups, to have to restrict users to Chinese and Latin characters. What about Chinese-speaking people in Russia, Thailand, the Middle East, etc.? There is no shortage of Chinese entrepreneurs in these places, and I am sure they would need to combine the local languages with Chinese and English, given that English is the most common business language in the world. Local people in these places may want to learn Chinese too. There are plenty of use cases where Chinese and a non-Latin script will need to be used together.


I would, but I'd like something that's fairly specific, in that it only has computer terms, and very little else. It'd also mean that I wouldn't have to scroll past stuff I wasn't interested in if I just wanted to look at random computer-related words and their meanings.

Anyway, the two websites are:

http://ihome.ust.hk/~lbsun/terms.html

http://www.iscs.nus.edu.sg/~colips/archives/glossary/GB/d.html

I think the top one is Traditional, the other Simplified.

