How to: MySQL/PHP umlauts and special characters not displayed correctly (UTF-8 vs. ISO)


Everyone knows the problem: for some reason, strings are written to the database in the wrong encoding. When that has happened, you can tell because character sequences like these appear:

'¦, '¨, '?, '´, '¸, 'À, 'Á, 'Â, 'Ã, 'Ä, 'Å, 'Æ, 'Ç, 'È, 'É, 'Ê, 'Ë,
'Ì, 'Í, 'Î, 'Ï, 'Ñ, 'Ò, 'Ó, 'Ô, 'Õ, 'Ö, 'Ø, 'Ù, 'Ú, 'Û, 'Ü, 'Ý, 'Þ,
'ß, 'à, 'á, 'â, 'ã, 'ä, 'å, 'æ, 'ç, 'è, 'é, 'ê, 'ë, 'ì, 'í, 'î,
'ï, 'ð, 'ñ, 'ò, 'ó, 'ô, 'õ, 'ö, 'ø, 'ù, 'ú, 'û, 'ý, 'þ, 'ÿ

The problem is that these characters are stored in one encoding but interpreted in another (typically a mix-up between UTF-8 and ISO-8859-1), which can have a variety of causes.
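How such sequences come about can be reproduced in a few lines. The following is only a minimal sketch, assuming the mbstring extension is available; the variable names are made up for illustration. It takes the UTF-8 bytes of "ü" and deliberately misreads them as ISO-8859-1:

```php
<?php
// "ü" encoded as UTF-8 consists of the two bytes 0xC3 0xBC.
$utf8 = "\xC3\xBC";

// Misinterpret those two bytes as ISO-8859-1 and re-encode them as UTF-8.
// Each byte becomes its own character: 0xC3 is "Ã", 0xBC is "¼".
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo strlen($utf8), " bytes before, ", strlen($garbled), " bytes after\n";
echo bin2hex($garbled), "\n"; // c383c2bc, the bytes of "Ã¼"
```

One character has turned into two, which is exactly the pattern in the list above.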

Convert ISO encoded strings to UTF-8

To avoid this, try:

$string = utf8_encode($string);
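Note that utf8_encode() is deprecated as of PHP 8.2. As a hedged alternative, the same ISO-8859-1 to UTF-8 conversion can be sketched with mb_convert_encoding(), assuming the mbstring extension is installed. The sample string is made up for illustration:

```php
<?php
// "M\xE4rz" is "März" in ISO-8859-1: the "ä" is the single byte 0xE4.
$latin1 = "M\xE4rz";

// Same conversion as utf8_encode(), but with both encodings spelled out.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo bin2hex($latin1), "\n"; // 4de4727a
echo bin2hex($utf8), "\n";   // 4dc3a4727a
```

Naming the source encoding explicitly also makes it obvious what the code assumes about the input.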

..

Check the encoding

The encoding of strings can be checked with the function mb_detect_encoding.

echo mb_detect_encoding($string);

For a quick and dirty fix you can use the following solution:

if (mb_detect_encoding($string) != 'UTF-8') {
    $string = utf8_encode($string);
}
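Since mb_detect_encoding() only makes a guess, a slightly more defensive variant validates the bytes with mb_check_encoding() instead. This is a sketch, not a definitive fix: the helper name toUtf8 is hypothetical, PHP 7 or later is assumed for the type declarations, and anything that is not valid UTF-8 is simply assumed to be ISO-8859-1.

```php
<?php
/**
 * Hypothetical helper: returns $string unchanged if it is already valid
 * UTF-8, otherwise treats it as ISO-8859-1 and converts it.
 */
function toUtf8(string $string): string
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }
    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
}

$fromLatin1 = toUtf8("Gr\xFC\xDFe");         // ISO-8859-1 bytes for "Grüße"
$alreadyOk  = toUtf8("Gr\xC3\xBC\xC3\x9Fe"); // UTF-8 bytes for "Grüße"

var_dump($fromLatin1 === $alreadyOk); // bool(true)
```

Because valid UTF-8 input is returned untouched, calling the helper twice does no harm, which is not true of a bare utf8_encode().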

Set the database connection encoding

Another source of error is the transfer of data to the database. The connection should always be switched to UTF-8 once, right after it has been opened:

...
mysql_connect();
mysql_query("SET NAMES 'utf8'");
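The mysql_* functions used above were removed in PHP 7. With the mysqli extension the same step might look like the sketch below. The function name and all connection parameters are hypothetical placeholders, and the choice of utf8mb4 rather than utf8 is an assumption on my part (it is MySQL's full four-byte UTF-8 character set), not something the original setup prescribes.

```php
<?php
/**
 * Hypothetical helper: opens a mysqli connection and switches it to
 * utf8mb4 before any query is sent. All parameters are placeholders.
 */
function openUtf8Connection(string $host, string $user, string $password, string $database): mysqli
{
    $db = new mysqli($host, $user, $password, $database);
    if ($db->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $db->connect_error);
    }
    // set_charset() also informs the client library of the new charset,
    // which a plain "SET NAMES" query does not.
    if (!$db->set_charset('utf8mb4')) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }
    return $db;
}
```

A call would then be something like openUtf8Connection('localhost', 'user', 'secret', 'mydb'), with your own credentials in place of these made-up values.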

Loading UTF-8 encoded PHP files into an ISO-encoded project

If UTF-8 encoded PHP files are loaded by accident, the output may end up being converted to UTF-8 despite all your efforts.

The following will help:

require_once "utf-8.php";
header('Content-Type: text/html; charset=ISO-8859-1');

Helper function to encode arrays to UTF-8

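A simple recursive function to encode a multidimensional array to UTF-8 looks like this (working with references would perform better):

function utf8encodeArray($array)
{
        foreach($array as $key => $value)
        {
            if(is_array($value))
            {
                // Recurse into nested arrays.
                $array[$key] = utf8encodeArray($value);
            }
            elseif(!mb_detect_encoding($value, 'UTF-8', true))
            {
                // Not valid UTF-8: assume ISO-8859-1 and convert.
                $array[$key] = utf8_encode($value);
            }
        }
        // Return the converted copy; without this the function yields null.
        return $array;
}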
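The reference-based variant hinted at above could be sketched with array_walk_recursive(), which visits every non-array leaf and lets the callback modify it in place. The function name is hypothetical, PHP 7.1 or later is assumed, and values that are not valid UTF-8 are again assumed to be ISO-8859-1.

```php
<?php
/**
 * Hypothetical in-place variant: converts every string leaf that is not
 * valid UTF-8, assuming such values are ISO-8859-1.
 */
function utf8EncodeArrayInPlace(array &$data): void
{
    array_walk_recursive($data, function (&$value): void {
        // Leave integers, booleans, null etc. untouched.
        if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
            $value = mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
        }
    });
}

$row = ['name' => "J\xFCrgen", 'tags' => ["caf\xE9", 'plain'], 'id' => 7];
utf8EncodeArrayInPlace($row);
echo bin2hex($row['name']), "\n"; // 4ac3bc7267656e
```

Because the array is passed by reference, no copy is built and nothing has to be returned.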

The header

You should also check whether the header of the HTML document is set to UTF-8:

<head>
       <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
 </head>

or with PHP

header ('Content-type: text/html; charset=utf-8');

File encoding

The PHP file itself must also be encoded in UTF-8, otherwise umlauts and special characters will be destroyed (this can be checked and changed e.g. with Notepad: Main Menu > Encoding > UTF-8). In any good IDE the encoding can be changed or set (per project or per file).
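If no suitable editor is at hand, the file encoding can also be checked from PHP itself. The following is a rough sketch with a hypothetical helper name. It only tells you whether the bytes are valid UTF-8 and whether the file starts with a byte order mark (BOM); it cannot tell which ISO encoding a non-UTF-8 file uses.

```php
<?php
/**
 * Hypothetical helper: reports whether a file's bytes are valid UTF-8
 * and whether it starts with a UTF-8 byte order mark (BOM).
 */
function inspectFileEncoding(string $path): array
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return [
        'valid_utf8' => mb_check_encoding($bytes, 'UTF-8'),
        'has_bom'    => strncmp($bytes, "\xEF\xBB\xBF", 3) === 0,
    ];
}

// Demo with a temporary file that contains an ISO-8859-1 encoded "ä".
$tmp = tempnam(sys_get_temp_dir(), 'enc');
file_put_contents($tmp, "echo 'M\xE4rz';");
$report = inspectFileEncoding($tmp);
unlink($tmp);
var_dump($report['valid_utf8']); // bool(false)
```

Keep in mind that a file containing only ASCII characters passes as valid UTF-8, so the check is only meaningful for files that actually contain umlauts.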

Other sources of error

You can lose the encoding of strings or documents in many ways. Particularly insidious are PHP string-manipulation functions that automatically convert the string to UTF-8 before returning it, which is a problem if the web page is encoded in ISO-8859-1. Unfortunately, I can no longer find the functions in question; I would love feedback on the topic.

If it is too late and the data has already been stored in the database, you can replace the broken umlauts as follows:

private function getUmlauteArray() {
    // Keys are the garbled (double-encoded) sequences, values the intended characters.
    // Where the second character is invisible, it is written as a byte escape.
    return array(
        'Ã¼'=>'ü', 'Ã¤'=>'ä', 'Ã¶'=>'ö', 'Ã–'=>'Ö', 'ÃŸ'=>'ß',
        "Ã\xC2\xA0"=>'à', 'Ã¡'=>'á', 'Ã¢'=>'â', 'Ã£'=>'ã',
        'Ã¹'=>'ù', 'Ãº'=>'ú', 'Ã»'=>'û', 'Ã™'=>'Ù', 'Ãš'=>'Ú', 'Ã›'=>'Û', 'Ãœ'=>'Ü',
        'Ã²'=>'ò', 'Ã³'=>'ó', 'Ã´'=>'ô', 'Ã¨'=>'è', 'Ã©'=>'é', 'Ãª'=>'ê', 'Ã«'=>'ë',
        'Ã€'=>'À', "Ã\xC2\x81"=>'Á', 'Ã‚'=>'Â', 'Ãƒ'=>'Ã', 'Ã„'=>'Ä', 'Ã…'=>'Å',
        'Ã‡'=>'Ç', 'Ãˆ'=>'È', 'Ã‰'=>'É', 'ÃŠ'=>'Ê', 'Ã‹'=>'Ë',
        'ÃŒ'=>'Ì', "Ã\xC2\x8D"=>'Í', 'ÃŽ'=>'Î', "Ã\xC2\x8F"=>'Ï',
        'Ã‘'=>'Ñ', 'Ã’'=>'Ò', 'Ã“'=>'Ó', 'Ã”'=>'Ô', 'Ã•'=>'Õ', 'Ã˜'=>'Ø',
        'Ã¥'=>'å', 'Ã¦'=>'æ', 'Ã§'=>'ç', 'Ã¬'=>'ì', "Ã\xC2\xAD"=>'í', 'Ã®'=>'î', 'Ã¯'=>'ï',
        'Ã°'=>'ð', 'Ã±'=>'ñ', 'Ãµ'=>'õ', 'Ã¸'=>'ø', 'Ã½'=>'ý', 'Ã¿'=>'ÿ', 'â‚¬'=>'€'
    );
}

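public function fixeUmlauteDb() {
    $umlaute = $this->getUmlauteArray();
    foreach ($umlaute as $key => $value) {
        // `table` and `row` are placeholders for the affected table and column.
        $sql = "UPDATE table SET row = REPLACE(row, '{$key}', '{$value}') WHERE row LIKE '%{$key}%'";
        // Run the statement; building the string alone changes nothing.
        mysql_query($sql);
    }
}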

Attention: The script only works if the encoding of the PHP file is UTF-8 (this can be checked and changed e.g. with Notepad: Main Menu > Encoding > UTF-8).

The script only works in a UTF-8 encoded project (see article: PHP puzzle).
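As an alternative to a hand-maintained lookup table, the double encoding can in many cases be undone programmatically. The sketch below is an assumption-laden illustration rather than a tested migration tool: the helper name is hypothetical, it assumes the garbling came from UTF-8 bytes being read as Windows-1252, and it should only ever be tried on a copy of the data.

```php
<?php
/**
 * Hypothetical helper: undoes one round of UTF-8 double encoding
 * (UTF-8 bytes that were read as Windows-1252 and encoded again).
 * Returns the input unchanged if it does not look double encoded.
 */
function fixDoubleEncoded(string $s): string
{
    // Map each character back to the single byte it was misread from.
    $candidate = mb_convert_encoding($s, 'Windows-1252', 'UTF-8');

    // Accept only if the bytes are valid UTF-8 and the step is reversible,
    // which rules out characters that were replaced by "?".
    if (mb_check_encoding($candidate, 'UTF-8')
        && mb_convert_encoding($candidate, 'UTF-8', 'Windows-1252') === $s) {
        return $candidate;
    }
    return $s;
}

// "\xC3\x83\xC2\xBC" is the garbled form "Ã¼" of "ü".
echo bin2hex(fixDoubleEncoded("\xC3\x83\xC2\xBC")), "\n"; // c3bc
```

Strings that are already correct are returned as they are, so the function can be applied to a whole column without first filtering for damaged rows.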


  1. ISO-encoded text never “accidentally” looks like valid UTF-8, so it can be checked unambiguously whether a string is in Windows (alias ISO-8859-1) encoding or not.

    if ((strlen($s) > 0) && 'ISO-8859-1' == mb_detect_encoding($s, 'UTF-8, ISO-8859-1', true))
        $s = utf8_encode($s);

    • Thanks for the pointer, I did not know that function; I have added it to the article right away!

  2. > The problem is that these characters are not encoded in utf8 but have been stored in utf8, because the DB column has been defined that way (Kolation: e.g. utf8_general).

    In my experience, it looks like this when strings exist in a multibyte character set (e.g. UTF-8) and are stored in a DB column that has a single-byte character set (e.g. latin1). Two or three bytes taken together actually form one special character, but the single-byte character set interprets each byte individually and stores it as such.

    But there are many (common) combinations that can lead to something like this in the DB.

  3. Hello, the introduction says: “The problem is that these characters are not encoded in utf8 but have been stored in utf8, because the DB column has been defined that way (Kolation: e.g. utf8_general).”
    1. It is spelled “Collation”, i.e. with two “L”s.
    2. The collation is only the sort order; it is not the problem here. The problem is the character set (encoding).
    Please correct.
    Regards, franc

    • Thank you very much for your help, franc; it has been changed. You are completely right that the collation is only the sort order.
      It determines, for example, whether Ä is treated as ae or as a when sorting.

  4. Pingback: [PHP/MySQL] Store special characters

  5. Hello Sebastian,

    I have a problem, namely when I write data to my DB. The individual pages are encoded in UTF-8 without BOM, and so is the database.
    If I check the submitted data with “mb_detect_encoding”, I also get “UTF-8” back.

    And yet I still have the strange characters in my database. When I read the data out again, everything is OK; in other words, on the page there are apparently no problems.

    I have tried the command “mysql_query("SET NAMES 'utf8'");” before writing the data to the DB. With that, the data in the DB is correct, i.e. without strange characters, but then I have the hieroglyphics on the page again.

    The header is set to UTF-8.

    Can you tell me what is going on?
    Thank you in advance!

    • Hi,
      It can only be down to the file encoding, which is set to ISO. Try changing the encoding of the file in Notepad ...

  6. The URL mentions special characters, but I cannot find anything about special characters anywhere. So this is not what I was looking for