PHP & UTF-8: trouble loading UTF-8 data


Posted

So I am working on a simple web app using PHP. I am trying to test and see if the app processes my XML file correctly. If I view the XML file in Firefox, everything looks good. The document structure is good, I can see the UTF-8 (Zhuyin Fuhao to be exact) characters just fine. I even explicitly define the encoding at the top of the XML file, just to be safe.

The app seems to read the XML document just fine, but it garbles all non-English UTF-8 text. It leaves me with the lovely "??" instead of my Zhuyin Fuhao. Any ideas why this might be? I made sure that I view the output as UTF-8, but that does not change anything, so I am assuming the data is garbled before the output stage. Does anyone have experience working with PHP, XML, and UTF-8? I have included the code below.

function character_data($parser, $data)
{
    echo utf8_encode($data);
}

//other two callbacks for the XML parser

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, "opening_element", "closing_element");
xml_set_character_data_handler($parser, "character_data");


if (!($fp = fopen($xmlSource, "r"))) 
{
  die("could not open XML input");
}
while ($data = fread($fp, 4096)) 
{
  if (!xml_parse($parser, $data, feof($fp))) 
  {
      die(sprintf("XML error: %s at line %d", xml_error_string(xml_get_error_code($parser)), xml_get_current_line_number($parser)));
  }
}

xml_parser_free($parser);

Posted

PHP's core string functions treat strings as raw bytes rather than Unicode text, so this may be an issue with the XML parser. You can always test by converting the text into Guobiao (GB2312) and seeing if that works:

$gbstring = mb_convert_encoding($utfstring, "GB2312", "UTF-8");

... will give you a Guobiao (GB2312) string instead of UTF-8.
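
Before converting anything, it may also be worth checking which encodings the parser itself is using. Depending on the PHP version, xml_parser_create() called without an argument can fall back to ISO-8859-1, and characters that do not fit the target encoding come out as question marks. Running utf8_encode() on data that is already UTF-8 would garble it a second time. Here is a minimal sketch, assuming the source file really is UTF-8; extract_text() is a made-up helper name for illustration, not something from the original code:

```php
<?php
// Hypothetical helper: collects all character data from a UTF-8 XML string.
function extract_text(string $xml): string
{
    $text = '';

    // Request UTF-8 explicitly instead of relying on the version-dependent default.
    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

    // Append the data unchanged: no utf8_encode(), because the parser
    // is already handing the callback UTF-8 bytes.
    xml_set_character_data_handler($parser, function ($parser, $data) use (&$text) {
        $text .= $data;
    });

    // Parse the whole document in one call (third argument marks the final chunk).
    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }

    return $text;
}
```

If the Zhuyin Fuhao survive a call such as extract_text(file_get_contents($xmlSource)), the parser settings were the likely culprit rather than the missing multibyte functions.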

Posted

The multibyte (mb) functions aren't always compiled in as standard; it depends on your web host. If they aren't, you can run a PHP build that includes them as a CGI.
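
A quick way to find out what a given host offers is to test for the function at runtime. The sketch below is only an illustration: to_gb2312() is a hypothetical helper name, and the fallback to iconv() is my own assumption about what might be installed when mbstring is not.

```php
<?php
// Hypothetical helper: converts a UTF-8 string to GB2312 with whichever
// extension happens to be available on the host.
function to_gb2312(string $utf8): string
{
    // Preferred path: the mbstring extension is compiled in or loaded.
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($utf8, 'GB2312', 'UTF-8');
    }

    // Assumed fallback: many hosts without mbstring still ship iconv.
    if (function_exists('iconv')) {
        $converted = iconv('UTF-8', 'GB2312', $utf8);
        if ($converted !== false) {
            return $converted;
        }
    }

    throw new RuntimeException('No usable encoding conversion function found');
}
```

Calling function_exists('mb_convert_encoding') on its own is enough to tell whether the multibyte functions are there at all.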

Posted

I experienced a similar issue when I installed forum software on one of the websites I manage. Check your .htaccess file.

Posted

I was afraid that the problem might be with PHP itself. If I cannot get the multibyte functions to work, then I will probably just write the utility in Perl.

Edit:

It appears that my webhost does not have the multibyte functions installed. What fun....

Posted

If you drop me an email (admin@) I'll see if I can find the cgi files - you drop them into your cgi-bin folder, then add a .htaccess file in your working folder to tell the server to process php files via that rather than the compiled php. Or something like that. It worked, anyway. :mrgreen:
