geek_frappa Posted July 2, 2004 at 03:05 PM
Is Unicode what everyone is using?
pazu Posted July 12, 2004 at 02:52 PM
On my website, a mix of both... I should really change everything to Unicode, but I'm too lazy. My new travel pages are in Unicode, but the core is still in Big5.
geek_frappa Posted July 12, 2004 at 03:03 PM
So, is Unicode better? Should I work towards converting to Unicode and stop using Big5 or GB? (Ha, I'm lazy too.)
imron Posted July 13, 2004 at 02:10 AM
Unicode is far and away better than Big5 or the GB family of encodings. When dealing with Chinese, the main reason is that you can happily intermix Traditional and Simplified characters with Unicode, whereas with the other encodings it's usually an either/or kind of thing. In addition, Unicode lets you intermix many other languages, e.g. Thai, Korean, Arabic and Hebrew, and even historical scripts such as runes (Tolkien's invented scripts have been proposed but are not part of the standard). Basically, the aim of Unicode is to be all-encompassing, and currently it does a better job of that than anything else out there. See this page for a nice introduction: http://www.unicode.org/standard/WhatIsUnicode.html
Now, there are those who will say "aha, but the recent(-ish) GB18030 standard has both Traditional and Simplified forms, so you can just use that, and like Unicode, GB18030 also supports all these other languages". However, if you do a bit of research on these encodings, you will find that GB18030 is essentially just a mapping table that maps GB code points to Unicode. In fact, the GB18030 standard is defined mostly in terms of Unicode code points. A great introduction to the issues involved for programmers can be found here: http://oss.software.ibm.com/icu/docs/papers/gb18030.html And an XML mapping table for the GB18030 standard can be found here: http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-2000.xml?rev=1.4&content-type=text/plain
However, rather than using this mapping table directly to convert between code sets, you're better off using the support built into your OS. Under Linux this can be done with libiconv, and under Windows you can use the WideCharToMultiByte and MultiByteToWideChar API calls.
Anyway, put simply, Unicode is a global standard, and GB18030 is China's way of maintaining backwards compatibility with previous GB standards while still remaining future-compatible with the rest of the world. Plus they get the "face" of maintaining their own encoding system. I'm not too sure about recent developments with the Big5 encoding, so I can't really say much on that. However, separate encodings for separate languages are a remnant of an unconnected world and don't really fit in with how things are today. Personally, I think the sooner the rest of the world starts using Unicode, the better off we'll all be.
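The conversion imron describes can be sketched in a few lines. This is only an illustration, assuming Python's built-in codecs instead of libiconv or the Win32 calls named in the post; the helper name `transcode` and the sample strings are made up for the example.

```python
def transcode(data: bytes, source_encoding: str, target_encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: decode legacy bytes, then re-encode them."""
    text = data.decode(source_encoding)   # legacy bytes -> Unicode string
    return text.encode(target_encoding)   # Unicode string -> target bytes


# Sample Traditional Chinese text, first stored as Big5 bytes.
big5_bytes = "繁體中文".encode("big5")

# Convert the Big5 bytes to UTF-8 by going through Unicode.
utf8_bytes = transcode(big5_bytes, "big5")
print(utf8_bytes.decode("utf-8"))

# GB18030 can represent every Unicode code point, so a round trip
# through it preserves mixed-script text.
mixed = "繁體 简体 한국어 ไทย"
round_trip_ok = mixed.encode("gb18030").decode("gb18030") == mixed
print(round_trip_ok)
```

The same two-step pattern (decode to Unicode, encode to the target) is what the OS-level converters do as well; only the API differs.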
pazu Posted July 13, 2004 at 02:15 PM
Agree with imron. Big5 is inferior to GB, and GB is inferior to Unicode in most situations. Unless your visitors are all from Hong Kong/Taiwan, or all from mainland China, you should use Unicode. I'm so happy that I can now mix everything in my writing: Traditional Chinese and Japanese kanji, then some kana and Vietnamese. 用統一码, 打廣東話又得, 写普通话也行, 日本語もできる、tiếng Việt cũng được. [Roughly: "With Unicode, typing Cantonese works, writing Mandarin is fine, Japanese is possible, and Vietnamese is OK too."] It's a dream that has finally come true, and I am so glad to have seen this day with my own eyes! Haha.
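To make the point about mixing concrete, here is a small check of which encodings can hold pazu's sentence. It is a sketch using Python's standard codecs; the function name `can_encode` is an assumption introduced for the example.

```python
sample = "用統一码, 打廣東話又得, 写普通话也行, 日本語もできる、tiếng Việt cũng được"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Big5 lacks the Simplified characters and GB2312 lacks the Traditional ones,
# so only UTF-8 is expected to accept the whole sentence.
for name in ("utf-8", "big5", "gb2312"):
    print(name, can_encode(sample, name))
```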
roddy Posted July 13, 2004 at 04:01 PM
pazu wrote: "用統一码, 打廣東話又得, 写普通话也行, 日本語もできる、tiếng Việt cũng được"
And that's why I have 'utf-8' at the top of all my pages. Roddy
benotnobody Posted July 14, 2004 at 07:38 AM
If you're using the Traditional Chinese IME 2002a under Windows XP, is there some way you can configure it to output in UTF-8? I ask because I don't like using NJStar Communicator, even though it can do Unicode.
Claw Posted July 14, 2004 at 11:00 AM
benotnobody wrote: "if you're using the Traditional Chinese IME 2002a under Windows XP, is there some way you can configure it to output in utf-8??? Because I don't like using NJStar Communicator, even though it can do unicode."
The Windows XP IME always outputs Unicode. If you save the text to a file, then depending on the program you can save it in different encoding schemes (UTF-8, Big5, GB, etc.). If you submit the text on a web page (like this forum, for instance), it is submitted in the encoding scheme that the web page uses (which for this site, as roddy says, is UTF-8).
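Claw's distinction between the text itself and the encoding chosen at save time can be illustrated as follows. This is a sketch under assumptions: Python stands in for "the program", and the file names and sample text are invented for the example.

```python
import tempfile
from pathlib import Path

text = "中文"   # held in memory as Unicode, as the IME delivers it
sizes = {}      # encoding name -> number of bytes written to disk

with tempfile.TemporaryDirectory() as tmp:
    for encoding in ("utf-8", "big5", "gb2312"):
        path = Path(tmp) / f"sample-{encoding}.txt"
        path.write_text(text, encoding=encoding)   # encoding is chosen at save time
        raw = path.read_bytes()
        sizes[encoding] = len(raw)
        print(encoding, len(raw), raw.hex())
        # Reading back with the same encoding recovers the same text.
        assert path.read_text(encoding=encoding) == text
```

The bytes on disk differ for each encoding, but the characters read back are identical as long as the reader uses the encoding the file was saved in.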
Konglong Posted July 15, 2004 at 03:48 AM
And since Win2k, Microsoft's operating systems have been built on Unicode. For all the web pages I make, whether for my site or for my PocketPC, I stick with UTF-8. I agree with the posts above: UTF-8 is the way to go for web pages. Big5 vs. GB is just a headache that will eventually have to come to an end. K
imron Posted July 15, 2004 at 08:22 AM
Konglong wrote: "And since Win2k, the Microsoft OS's have been built upon Unicode"
Actually, all versions of WinNT were Unicode too. It's only the 'home' OSes (3.xx, 95, 98, Me, etc.) that didn't use it. Win2k and WinXP have it because they are basically built on top of WinNT.
benotnobody Posted July 15, 2004 at 08:37 AM
Thanks for that.
pazu Posted July 25, 2004 at 02:22 PM
But how about Unicode on PalmOS? Does it handle Unicode Chinese/Japanese/Korean?
geek_frappa Posted August 9, 2004 at 02:16 AM
Most pages on the mainland still use GB?! Is that because people are still using Win98 and WinME? YiSou and Baidu don't appear to be Unicode-"friendly", although they do encode their URLs... hmmm.