
Unicode (UTF-8) in HTML


fkj


The HTML looks good. Perhaps the web server is always serving the page with the Latin-1 (ISO-8859-1) encoding. The META tag can't override that, because a charset declared in the HTTP header takes precedence over one declared in the document. If you have control over the server, you might look into changing the default encoding it uses.

If the page works elsewhere, it's probably the server.
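To make the symptom concrete, here is a minimal Python sketch of what happens when correct UTF-8 bytes are read as ISO-8859-1. The sample text and the helper name `decode_as` are my own assumptions for illustration, not anything taken from the page in question.

```python
# Sample text chosen for illustration only (an assumption, not from the page).
text = "中文"
utf8_bytes = text.encode("utf-8")  # the bytes a UTF-8 HTML file would contain


def decode_as(data: bytes, charset: str) -> str:
    """Decode raw bytes the way a browser would for a given charset label."""
    return data.decode(charset)


right = decode_as(utf8_bytes, "utf-8")       # server says UTF-8
wrong = decode_as(utf8_bytes, "iso-8859-1")  # server says Latin-1

# Each Chinese character is three bytes in UTF-8, so the Latin-1 reading
# produces three unrelated characters per Chinese character.
print(len(right), len(wrong))
print(ascii(wrong))
```

The bytes in the file are the same in both cases; only the label the server attaches to them changes what the reader sees.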


Strange that your meta tag isn't working. If the page is generated by a script (CGI, for example), you should be able to get around it by making the following line the FIRST thing the script outputs, where "\n" is a newline. Leave out the surrounding quotation marks too. The server then sends it as an HTTP header, which sets the page to UTF-8 regardless of META tags in the document.

"Content-Type: text/html; charset=UTF-8 \n\n"

As long as this text is the first thing the script outputs, browsers should not display it. In a static .html file it would simply show up as text on the page.
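For what it's worth, here is a rough sketch of how that header line would be emitted, assuming the page comes from a CGI-style Python script rather than a static file. The function name `cgi_response` and the sample markup are hypothetical.

```python
import sys


def cgi_response(html: str) -> bytes:
    """Build CGI output: header line, blank line, then the UTF-8 encoded body."""
    # The blank line (second "\n") marks the end of the headers.
    header = "Content-Type: text/html; charset=UTF-8\n\n"
    return header.encode("ascii") + html.encode("utf-8")


if __name__ == "__main__":
    # Hypothetical page body; write bytes so the encoding is not changed
    # by the console's own settings.
    page = "<html><body><p>中文</p></body></html>"
    sys.stdout.buffer.write(cgi_response(page))
```

The point of returning bytes is that the body is encoded exactly once, as UTF-8, to match what the header promises.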


We had a similar topic before, but that was for Macs. Might be useful.

My problems with placing UTF-8 Chinese directly into HTML (rather than pulling it in from a database) have always come from my desktop automatically encoding the file as GB2312. The fix is to use a text editor (Notepad will do) that lets you specify the encoding to save as. Hope that helps.
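If a file has already been saved as GB2312, it can also be converted after the fact. This is only a sketch: the helper names `reencode` and `resave_as_utf8` are made up for illustration, and it assumes the file really is GB2312 to begin with.

```python
from pathlib import Path


def reencode(data: bytes, source_encoding: str = "gb2312") -> bytes:
    """Decode bytes from the source encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def resave_as_utf8(path: str, source_encoding: str = "gb2312") -> None:
    """Rewrite a file in place as UTF-8 (assumes it is in source_encoding)."""
    p = Path(path)
    p.write_bytes(reencode(p.read_bytes(), source_encoding))


# The same two characters take different bytes in each encoding:
gb = "中文".encode("gb2312")   # 2 bytes per character
utf8 = reencode(gb)            # 3 bytes per character
print(len(gb), len(utf8))
```

Running the conversion on a file that is already UTF-8 would either fail or corrupt it, so check the current encoding first.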

Roddy


chinesetools got it. These are the headers that my browser is getting from your server with the sample HTML document:

HTTP/1.1 200 OK
Date: Sat, 03 Dec 2005 13:15:20 GMT
Server: Apache/2.0.49 (Win32) PHP/5.0.3 mod_jk2/2.0.0
Last-Modified: Fri, 02 Dec 2005 18:10:23 GMT
Etag: "1cd0e-10f-7baed7ab"
Accept-Ranges: bytes
Content-Length: 271
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=ISO-8859-1

See the last line? That's what your browser (and mine) uses to set the document encoding when it gets the document from the server. You can change that line by adding a directive to the .htaccess file (create one if it doesn't exist) in the root folder of your web space. For a single file, wrap the directive in a Files block that names the file (example.html is a placeholder):

<Files "example.html">
   AddCharset UTF-8 .html
</Files>

For all HTML documents:

AddCharset UTF-8 .html

For all HTML/text/CGI documents or programs:

AddDefaultCharset UTF-8

See here for a longer explanation:

http://www.w3.org/International/questions/qa-htaccess-charset
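To check what a server is declaring, the charset can be pulled out of a Content-Type header value. The sketch below uses Python's standard `email.message` module to parse the parameter. The function name `charset_from_content_type` is hypothetical, and returning `None` when no charset is present is my own choice for the example.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns None when the header has no charset parameter.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=ISO-8859-1"))
print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

When the result is `None`, the server has left the choice open and the browser has to rely on other hints, such as the META tag.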


msittig, how do you display the HTTP headers?

On Windows XP, an easy way is to use telnet:

Run: telnet

type: o www.scmp.com 80

hit: enter

type (case sensitive): HEAD / HTTP/1.1

hit: enter

type: host: www.scmp.com

hit twice: enter

HTTP/1.1 200 OK
Server: Netscape-Enterprise/3.6 SP3
Date: Sun, 04 Dec 2005 01:37:09 GMT
Content-type: text/html
Connection: close

Or, if you know the HTML filename, try HEAD /path/filename.html HTTP/1.x

For example, for http://www.fkj.dk/chinese/test.htm you would do this:

in telnet, type: o www.fkj.dk 80

enter

type: HEAD /chinese/test.htm HTTP/1.0

hit: enter twice

You can get the full source code of that page if you change HEAD to GET.
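The same telnet session can be scripted. Below is a minimal Python sketch using a plain socket. The function names `build_head_request` and `fetch_headers` are hypothetical, and the `Connection: close` header is an addition of mine so that the server hangs up after replying.

```python
import socket


def build_head_request(host: str, path: str = "/") -> bytes:
    """Build the raw text that would otherwise be typed into telnet."""
    lines = [
        f"HEAD {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close after responding
        "",                   # blank line ends the request headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_headers(host: str, path: str = "/", port: int = 80,
                  timeout: float = 10.0) -> str:
    """Send the HEAD request and return whatever the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_head_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    # Latin-1 maps every byte to a character, so decoding cannot fail.
    return b"".join(chunks).decode("iso-8859-1")


# Needs network access, so it is left commented out here:
# print(fetch_headers("www.fkj.dk", "/chinese/test.htm"))
```

Look for the Content-Type line in the output to see which charset, if any, the server declares.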

