novemberfog Posted May 11, 2006 at 08:13 AM

So I am working on a simple web app using PHP. I am trying to test whether the app processes my XML file correctly. If I view the XML file in Firefox, everything looks good. The document structure is good, and I can see the UTF-8 characters (Zhuyin Fuhao, to be exact) just fine. I even explicitly define the encoding at the top of the XML file, just to be safe.

The app seems to read the XML document just fine, but it garbles all non-English UTF-8 text. It leaves me with the lovely "??" instead of my Zhuyin Fuhao. Any ideas why this might be? I made sure that I view the output as UTF-8, but it does not change anything, so I am assuming the data is garbled before the output stage.

Anyone have any experience working with PHP, XML, and UTF-8? I included the code below.

function character_data($parser, $data)
{
    echo utf8_encode($data);
}

// other two callbacks for the XML parser

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, "opening_element", "closing_element");
xml_set_character_data_handler($parser, "character_data");

if (!($fp = fopen($xmlSource, "r"))) {
    die("could not open XML input");
}

while ($data = fread($fp, 4096)) {
    if (!xml_parse($parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)));
    }
}

xml_parser_free($parser);
trevelyan Posted May 11, 2006 at 03:41 PM

PHP isn't designed to work with Unicode, so this may be an issue with the XML parsing class. You can always test by converting the text into guobiao and seeing if that works:

$gbstring = mb_convert_encoding($utfstring, "GB2312", "UTF-8");

... will give you a guobiao string instead of UTF-8.
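A rough sketch of that conversion test, in case it helps. The sample string is made up for illustration (it is not from the original XML file), and the round trip back to UTF-8 is just one way to check that nothing was lost on the way to GB2312. It assumes the mbstring extension is available.

```php
<?php
// Hypothetical sample text; substitute a string read from your own XML.
$utfstring = "注音";

if (function_exists('mb_convert_encoding')) {
    // UTF-8 -> GB2312 (guobiao), as suggested above.
    $gbstring = mb_convert_encoding($utfstring, "GB2312", "UTF-8");

    // Convert back to confirm the text survived the trip.
    $roundtrip = mb_convert_encoding($gbstring, "UTF-8", "GB2312");

    // Hex dump of the GB2312 bytes, handy for eyeballing the result.
    echo bin2hex($gbstring), "\n";
    echo ($roundtrip === $utfstring) ? "round trip ok" : "round trip lost data", "\n";
} else {
    echo "mbstring is not available on this PHP build\n";
}
```

If the round trip fails on a string that came out of the parser, the data was probably already damaged before the conversion step.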
roddy Posted May 11, 2006 at 04:01 PM

The multibyte (mb) functions aren't always compiled as standard, depends on your webhosting - if not you can run them as a cgi.
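A quick way to find out what a given host offers is to ask PHP directly. This is only a sketch: `conversion_backend()` is a hypothetical helper name, and falling back to iconv is my own assumption about what you might want to try if mbstring is missing.

```php
<?php
// Hypothetical helper: reports which conversion extension, if any, is usable.
function conversion_backend()
{
    if (function_exists('mb_convert_encoding')) {
        return 'mbstring';
    }
    if (function_exists('iconv')) {
        return 'iconv';
    }
    return null;
}

$backend = conversion_backend();
echo ($backend === null) ? "no conversion extension available" : "using $backend", "\n";
```

Dropping this into a test page on the host shows whether the mb functions are there before you build anything on top of them.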
badr Posted May 11, 2006 at 06:10 PM

I experienced a similar issue when I installed forum software on one of the websites I manage. Check your .htaccess file.
roddy Posted May 11, 2006 at 06:32 PM

How would an .htaccess file affect this?
novemberfog Posted May 11, 2006 at 10:03 PM (Author)

I was afraid that the problem might be with PHP itself. If I cannot get the multibyte functions to work, well, then I will probably just write the utility in Perl.

Edit: It appears that my webhost does not have the multibyte functions installed. What fun....
roddy Posted May 12, 2006 at 04:18 AM

If you drop me an email (admin@) I'll see if I can find the cgi files - you drop them into your cgi-bin folder, then add a .htaccess file in your working folder to tell the server to process php files via that rather than the compiled php. Or something like that. It worked, anyway.
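One thing that may be worth trying before going the cgi route, since it does not need the mb functions at all: the parser setup in the first post never tells the XML extension which encoding to hand back, and `utf8_encode()` assumes its input is ISO-8859-1. As far as I know, older PHP builds defaulted the parser's target encoding to ISO-8859-1, which would turn anything outside Latin-1 into "?". Below is a minimal sketch, assuming that is the cause. The inline XML string and the `$collected` variable are made up for illustration.

```php
<?php
// Hypothetical sample document standing in for the original XML file.
$xml = '<?xml version="1.0" encoding="UTF-8"?><words><w>ㄅㄆㄇ</w></words>';

$collected = '';

// Character data already arrives as UTF-8, so it is stored untouched
// (no utf8_encode(), which would treat the bytes as ISO-8859-1).
$character_data = function ($parser, $data) use (&$collected) {
    $collected .= $data;
};

// Ask for UTF-8 explicitly on both the input and the output side.
$parser = xml_parser_create('UTF-8');
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
xml_set_character_data_handler($parser, $character_data);

if (!xml_parse($parser, $xml, true)) {
    die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}
unset($parser);

echo $collected, "\n";
```

The page itself still has to be served as UTF-8 for the browser to show the result properly.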