View Full Version : Text encoding issues in db from upgrade or FQ? : phpMyAdmin 2.11.9.3 from 2.11.4?
Starbuck
12-19-2008, 03:26 PM
I am having some text encoding issues with my site.
I think the culprit is this code which was added to my db from somewhere:
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
where did this code come from?
Was it added by my upgrade to phpMyAdmin from 2.11.4 to 2.11.9.3? or did FQ add?
Original old db:
-- phpMyAdmin SQL Dump
-- version 2.11.4
-- http://www.phpmyadmin.net
--
-- Host: MySQL.mydomain.com
-- Generation Time: Nov 08, 2008 at 03:43 PM
-- Server version: 4.0.27
-- PHP Version: 4.4.4
--
-- Database: `xmydomain-db`
--
-- --------------------------------------------------------
Current db with text encoding issues:
-- phpMyAdmin SQL Dump
-- version 2.11.9.3
-- http://www.phpmyadmin.net
--
-- Host: MySQL.mydomain.com
-- Generation Time: Dec 19, 2008 at 02:16 PM
-- Server version: 4.0.27
-- PHP Version: 5.2.6
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
--
-- Database: `xmydomain-db`
--
-- --------------------------------------------------------
ps. my site is latin1 I believe and foreign characters are showing up like this: Kübeckgasse instead of Kübeckgasse
Where did this come from + What should I do?:umm:
Terra
12-19-2008, 04:21 PM
did FQ add?
No, we did not add and is most likely from the upgrade...
Starbuck
12-19-2008, 06:45 PM
I am a bit lost here, everything was fine until last month. The best way I can describe the issue is to list what happened since Nov.8.
I have contacted Zen Cart and they seem to this issue is due to something at my here at FQ.
Perhaps someone with more knowledge can spot what the issue is?
Thanks everyone for any assistance...
Nov 8 (Last backup before FQ moves my site to PHP 5.2.6)
-- phpMyAdmin SQL Dump
-- version 2.11.4
-- http://www.phpmyadmin.net
--
-- Host: MySQL.mydomain.com
-- Generation Time: Nov 08, 2008 at 03:43 PM
-- Server version: 4.0.27
-- PHP Version: 4.4.4
--
-- Database: `xmydomain-db`
--
-- --------------------------------------------------------
Next Back up Nov 19. Dumped with new version of phpMyAdmin 2.11.9.3 Version compatible with PHP 4+ and MySQL 3+.:
Site seems ok, with no obvious text issues.
-- phpMyAdmin SQL Dump
-- version 2.11.9.3
-- http://www.phpmyadmin.net
--
-- Host: MySQL.mydomain.com
-- Generation Time: Nov 19, 2008 at 06:15 PM
-- Server version: 4.0.27
-- PHP Version: 4.4.4
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
--
-- Database: `xmydomain-db`
--
-- --------------------------------------------------------
yada yada yada....
) TYPE=MyISAM AUTO_INCREMENT=122 ;
Next Dump Nov 30 same as above.
Dec 1. 2008 I request my db to be moved/upgraded to MySQL 5 by FQ
Dec 3. Next Back up dumped with newer version of phpMyAdmin SQL Dump
-- version 3.1.0 (Version compatible with PHP 5 and MySQL 5.)
-- phpMyAdmin SQL Dump
-- version 3.1.0
-- http://www.phpmyadmin.net
--
-- Host: MySQL.mydomain.com
-- Generation Time: Dec 03, 2008 at 09:12 AM
-- Server version: 5.0.44
-- PHP Version: 5.2.6
SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
--
-- Database: `xmydomain-db `
--
-- --------------------------------------------------------
This code is changed also:
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=122 ;
All data in db converted to UTF-8:
mˆllevÂngsgatan in old db is now möllevångsgatan
Site now showing issues (text encoding).
Browser issues (browser told to see in latin1 Western, when data is UTF-8)
data issues Zen Cart writing in latin1 to a UTF-8 db.
Small freak out :ytslobber:and quick retreat to back to the way things were (FQ please: move my db back to MySQL 4!)
Dec. 5 db now on Server version: 4.0.27-log MySQL client version: 5.0.54 - db Dumped with old version of phpMyAdmin 2.11.9.3 Version compatible with PHP 4+ and MySQL 3+.:
-- phpMyAdmin SQL Dump
-- version 2.11.9.3
-- http://www.phpmyadmin.net
--
-- Host: MySQL.mydomain.com
-- Generation Time: Dec 05, 2008 at 10:09 AM
-- Server version: 4.0.27
-- PHP Version: 5.2.6
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
--
-- Database: `xmydomain-db`
--
-- --------------------------------------------------------
) TYPE=MyISAM AUTO_INCREMENT=122 ;
Everything back to normal, all data in db back to original latin1 format: (mˆllevÂngsgatan) not (möllevångsgatan) all browsers displaying correctly.
Dec. 19: First new customer with foreign address and text encoding issue reappears:
He enters: Kübeckgasse site displays Kübeckgasse
Address in db: Kübeckgasse which looks like latin1
I email FQ the issue they respond:
We did not add that code. It is simply the new mysqldump format for MySQL version 5. The problem with PHP is that it must be linked against a specific version of the MySQL client libraries regardless of what version MySQL server your account is using. In the past we left PHP linked against the MySQL version 4 libraries because they worked with either version of the server. In mid-November we started having compatibility issues with the MySQL libraries so we linked PHP against the MySQL version 5 libraries instead of the version 4 libraries. It wouldn't matter which version of MySQL you were actually using at the time.
The code was added to the mysqldump format my the MySQL v5 client libraries. Since MySQL v4 doesn't support character sets PHP had no idea what the character set was so it assumed the data was the v5 default of UTF-8 and specified that in the mysqldump. However the data itself was really Latin1.
When your databases were upgraded to v5 we ran them through our standard conversion script which actually did convert the data from Latin1 to UTF-8.
I can't really make sense of the statements because my db looks like its in latin1 and should be on a MySQL 4 server.
Completely lost now...:ytstars: I thought I went back to the way it was last month?
Not sure if this means anything but in my Zen cart admin under server info I see PHP Version: 4.4.4 (Zend: 1.3.0) while cnc says PHP Version 5.2.6 Zend Engine v2.2.0 Zend Optimizer v3.2.2
Terra
12-19-2008, 08:21 PM
Unfortunately, this is going to be an issue between you and the Zen Cart developers...
They would know better how their software handles character encoding, and if something is wrong, it would seem they need to fix it...
In a nutshell, handling character encoding is a major pain to get right for software, however we have a lot of sites using foreign language character sets without any problems... This in turn leads me to believe it is a Zen Cart problem...
Your site is running on:
Apache 1.3.41
PHP 5.2.6 linked against the MySQL 5.0.54 libraries
MySQL 4.0.27 database engine
Feel free to pass that along...
Also, I would recommend moving back up to MySQL 5.x, because MySQL 4.x is going to be deprecated very soon...
Starbuck
12-20-2008, 10:09 AM
Thanks Terra for your help, this is all very confusing to me everything was running great until last month.
Ok, so to sum it up:
Last month FQ upgraded to PHP 5.2.6 and linked against MySQL 5.0.54 libraries.
The code (/*!40101...) was added to the mysqldump format by the MySQL v5 client libraries. and was not created by phpMyAdmin.
Because Zen Cart is set to latin1 and now linked to a MySQL 5.0.54 libraries. I am now having text encoding issues Latin1 vs. UTF-8
Any special character entered by customer :
Kübeckgasse goes into the db as it always has as latin1: K√ºbeckgasse but comes out UTF-8 displayed as: Kübeckgasse (ZC default PHP sets Browser to latin1)
So then.... All Zen Carts on FQ will have text encoding issues due to that fact PHP 5.2.6 is linked against MySQL 5.0.54 libraries (UTF-8) regardless of what MySql server they are on 4 or 5 correct?
So have other Zen Carts reported this issue?
Starbuck
12-22-2008, 04:02 PM
I have made some enquiries at ZC's forum and gotten a variety of suggestions and different directions to go in. I am however, hesitant to make sweeping changes without knowing the cause of this issue, as most ZC users have not had issues upgrading PHP or MySQL 5 with other hosts. I also do not want to stray to far from the ZC core code/db structure.
So My question is, since the only thing that has changed on my site to create this UTF-8 output issue is the upgrade from PHP 4 to PHP 5 by FQ. Is there something about FQ's "PHP 5.2.6 is linked against MySQL 5.0.54 libraries" that would specifically only allow utf8 encoding?
Again my ZC is set to western latin1, all data in db is latin1 and is stored as latin1, but all NEW data now comes out of my db displayed in UTF-8.
db is still on the MySQL 4.0.27 database engine
Arthur
12-23-2008, 06:59 AM
I have done some digging and here are my findings, in a nutshell (somewhat simplified here and there) -
Your data is encoded in latin1 (iso-8859-1) and is stored as such in your database (MySQL 4.0). The SQL headers (/*!40101 SET NAMES utf8 */) are added by phpMyAdmin and do not reflect the actual encoding.
MySQL 4.0 and PHP 4 do not know or care about the charset/encoding of the data, later versions (MySQL 4.1 and up and PHP 5 and up) do know about it and can handle different encodings. The default encoding for these later versions is UTF-8.
If you connect to a MySQL 4.0 database with PHP, the MySQL server will simply return the binary data, it doesn't return any information about the encoding of the data (it has no idea).
In PHP 4 the data that is returned by the MySQL server is handled as binary data, again without any information about the encoding. The data is sent as-is to the user's web browser, which will look at the HTML code to see if a charset is defined and will display the HTML code in the defined charset or if there's none present it will display it in the default charset.
PHP 5 knows about different charsets/encodings. If no encoding is defined, and this is the crucial part, it will assume the data is encoded in utf-8. You connect to a MySQL 4.0 database with PHP 5, the MySQL server doesn't return information about the encoding and PHP will internally handle the data as utf-8.
It is down to the PHP application to handle the data correctly, either by setting the encoding of the database connection to latin1, or outputting HTML pages that define the charset as utf-8 so that browsers will display the page correctly.
MySQL 4.1 and 5+ allow you to define the character set and collation used by your databases, you can even define different settings per field. Applications can then use this information to handle the data from your database correctly.
Is there something about FQ's "PHP 5.2.6 is linked against MySQL 5.0.54 libraries" that would specifically only allow utf8 encoding?
No, it will allow other encodings, however PHP 5 will default to utf-8 if no other encoding is explicitly defined.
phpMyAdmin 'sees' PHP is setting the encoding to the default, utf-8, and is adding the appropriate SQL headers for utf-8 in the SQL dump file.
ZenCart is apparently unaware of any encoding and is handling the utf-8 data as if it were encoded in latin1.
-Arthur
Starbuck
12-23-2008, 10:19 AM
Thank you so much Arthur for your concise explanation, I have been doing a lot of reading/googling about this issue, but most of it is very hard to understand and easy to bark up the wrong tree. Your answer made perfect sense! :ytteach:
I guess I will either somehow set my Default Character set to latin1 in the db or change the ZC's HTML handling from western to UTF-8. Zen Cart's next BIG upgrade requires that I re-encode language files as UTF-8-without-BOM format. So maybe changing ZC's HTML handling from western to UTF-8 is the way to go for now?
My goal is to have my site both on PHP 5 and MySQL 5 and not complicate Zen Cart upgrades. My recent attempt to move Zen Cart on to MySQL 5, further exemplifies your statement:
http://www.aota.net/forums/showthread.php?t=24371
here is a curious thing:
I have continued to do testing on my site inputing special characters (from foreign languages) so far using my USA keyboard I have not been able to replicate this issue:
A recent customer input Kübeckgasse on my ZC site. In the db this is written as K√ºbeckgasse, and displayed as Kübeckgasse. If I switch the browsers text encoding from Western to UTF-8 Kübeckgasse becomes Kübeckgasse.
using my USA keyboard/typing in Austrian:
I enter Kübeckgasse on my ZC site. In the db this is written as K¸beckgasse, and displayed as Kübeckgasse:ytrubeye: .If I switch the browsers text encoding from Western to UTF-8 Kübeckgasse becomes K�beckgasse.
Perhaps foreign keyboards input UTF-8?
Arthur
12-23-2008, 10:51 AM
So maybe changing ZC's HTML handling from western to UTF-8 is the way to go for now? I couldn't say for sure, but I think that may well be the way to go.
Perhaps foreign keyboards input UTF-8? No, that would be a browser issue. Keyboards don't do character encoding, they just send an electrical signal to the computer, which is processed and sent to the OS and then processed by whichever application is listening for input (here; your web browser, Safari).
If the page's charset is set to iso-8859-1, the browser will send output from a form in that charset.
If you see the "diamond with the question mark" that's usually a sign the text is encoded in iso-8859-1 (or iso-8859-15) and displayed as UTF. If you see two or three characters instead of one, that usually a sign the text is encoded in UTF and displayed as iso-8859-1 (UTF-8 uses 3 bytes per character).
When you say "In the db this is written as Kübeckgasse", remember that that is how the MySQL client displays it, it's not (not necessarily) exactly what is stored in the database.
-Arthur
vBulletin® v3.6.8, Copyright ©2000-2013, Jelsoft Enterprises Ltd.