At the moment, Gentoo is experiencing several inconsistencies and problems with character sets between MySQL and PHP. This is primarily because MySQL is now updated from 4.0 to 4.1 without warning or user interaction, which in most cases breaks existing extended characters: MySQL 4.1 stores every dump from earlier databases as UTF-8, which PHP still supports poorly.
For many PHP web applications which experience problems with extended characters (like umlauts, accents, …), the following hack might help.
- Locate the file where the MySQL database connection is opened.
- Add the following commands after opening the database connection:
mysql_query('SET character_set_client=latin1');
mysql_query('SET character_set_results=latin1');
mysql_query('SET character_set_connection=latin1');
This switches the connection and the result set back to latin1 instead of UTF-8. For performance reasons, the data in the database should then be stored as latin1 as well, so that the server does not have to convert between character sets on every query.
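The three commands above can also be wrapped in a small helper, so the character set is named in one place and a failed query does not go unnoticed. This is only a minimal sketch, assuming the old mysql extension that the commands above already use. The function names charset_statements() and force_connection_charset() are made up for illustration and are not part of PHP or of any application.

```php
<?php
// Hypothetical helper: builds the three SET statements for one character set.
function charset_statements($charset)
{
    // Accept only simple names such as latin1 or utf8, since the value
    // is placed directly into the SQL text.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        return array();
    }
    $statements = array();
    foreach (array('client', 'results', 'connection') as $part) {
        $statements[] = "SET character_set_$part=$charset";
    }
    return $statements;
}

// Hypothetical helper: runs the statements on an already opened
// connection (a link returned by mysql_connect()).
// Returns true only if every statement succeeded.
function force_connection_charset($link, $charset)
{
    $statements = charset_statements($charset);
    if (count($statements) === 0) {
        return false; // rejected character set name
    }
    foreach ($statements as $sql) {
        if (mysql_query($sql, $link) === false) {
            return false; // mysql_error($link) holds the reason
        }
    }
    return true;
}
```

It would be called once, right after the connection is opened, for example force_connection_charset($link, 'latin1'), where $link is whatever variable the application uses for its connection.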
According to messages in the Gentoo Forum, the developers have now released an ebuild for PHP (both 5.x and 4.4.2, still in unstable) that honors character-set settings in my.cnf when they are placed in a section specifically for PHP. You should use the sections [php-cli], [php-cgi] and/or [php-apache2handler]. Unfortunately I have not yet had time to test this out.
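Such a section might look roughly like the fragment below. This is an untested sketch: the section names come from the forum messages, but the option name default-character-set is an assumption (it is the usual MySQL client setting for the character set), and the patched PHP may expect something different.

```
# my.cnf
[php-apache2handler]
# assumed option name, not confirmed for the patched PHP ebuild
default-character-set=latin1

[php-cli]
default-character-set=latin1
```

If this works as described, the mysql_query() calls above should no longer be needed in the application code.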