MySQL: an implementation of utf8 storage with gbk output

  • 2020-05-06 11:45:41
  • OfStack

A site may need to convert from gb2312 (gbk, big5) to utf8, and many problems come up along the way. If the site is too large to convert in one go, it can be done step by step. Converting the data first, with very few changes to the front-end code, makes the whole process much easier. After several days of testing, I found that it is possible to store data as utf8 and output it as gbk. MySQL 4.1 has a feature that lets you specify the character set used by the current client connection. By default the connection uses latin1, or whatever character set is configured for the MySQL server. I create the fields with the utf8_general_ci collation.
DB:
SQL code:

CREATE TABLE `table` (
`id` INT( 10 ) NOT NULL ,
`name` VARCHAR( 50 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ,
INDEX ( `id` )
) ENGINE = innodb CHARACTER SET utf8 COLLATE utf8_general_ci;


PHP:
The write operation sets the connection to the utf8 character set, and the read operation sets the connection to the gbk character set.


PHP code:

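<?php
// Excerpt from a database wrapper class: $this->db_name and $this->db_link
// are assumed to be set elsewhere in that class.
// Note: the mysql_* functions need PHP 5.x; ext/mysql was removed in PHP 7.

    // Select the database and switch the connection to UTF-8 (used for writes).
    function _select_db_utf()
    {
        mysql_select_db($this->db_name, $this->db_link);

        // Initialize the connection character set and collation.
        mysql_query("SET NAMES utf8", $this->db_link);
        mysql_query("SET CHARACTER SET utf8", $this->db_link);
        mysql_query("SET COLLATION_CONNECTION='utf8_general_ci'", $this->db_link);

        return true;
    }

    // Select the database and switch the connection to GBK (used for reads).
    function _select_db_gb()
    {
        mysql_select_db($this->db_name, $this->db_link);

        // Initialize the connection character set and collation.
        mysql_query("SET NAMES gbk", $this->db_link);
        mysql_query("SET CHARACTER SET gbk", $this->db_link);
        mysql_query("SET COLLATION_CONNECTION='gbk_chinese_ci'", $this->db_link);

        return true;
    }
?>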
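
The two methods above use the old mysql_* functions. As a rough sketch only, the same switching could look like this with mysqli. The names collation_for() and use_charset() are hypothetical and not part of the original code; the charset and collation pairs are the ones used in this article.

```php
<?php
// Connection character sets used in this article, with their collations.
const CONNECTION_COLLATIONS = [
    'utf8' => 'utf8_general_ci',
    'gbk'  => 'gbk_chinese_ci',
];

/**
 * Look up the collation paired with a character set (hypothetical helper).
 * Only whitelisted names are accepted, so nothing untrusted reaches SQL.
 */
function collation_for(string $charset): string
{
    if (!isset(CONNECTION_COLLATIONS[$charset])) {
        throw new InvalidArgumentException("Unsupported charset: $charset");
    }
    return CONNECTION_COLLATIONS[$charset];
}

/**
 * Switch an open mysqli connection to $charset (hypothetical helper).
 * Returns false on failure; mysqli may instead throw if it is configured
 * to report errors as exceptions.
 */
function use_charset(mysqli $link, string $charset): bool
{
    $collation = collation_for($charset);

    // set_charset() changes the connection character set and also informs
    // the client library, which matters for string escaping.
    if (!$link->set_charset($charset)) {
        return false;
    }
    return $link->query("SET collation_connection = '$collation'") !== false;
}
```

A write path would call use_charset($link, 'utf8') before inserting, and a gbk page would call use_charset($link, 'gbk') before selecting.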
A few things to note:
1. MySQL must be compiled with support for the gbk, gb2312, and utf8 character sets.
2. The data stored in the database must be guaranteed to be correctly encoded UTF-8.
3. Both the write and the read operation must specify the correct character set for the connection.

If the front-end code cannot submit data as UTF-8, the characters must be transcoded before they are stored. (Data submitted with AJAX, for example, is already correct UTF-8 and needs no conversion.)
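
One possible way to handle both cases is to check the input first and transcode only when it is not already valid UTF-8. This is a minimal sketch, assuming the mbstring extension is available and that non-UTF-8 input arrives as GBK; the function name to_utf8_for_storage() and its $legacyEncoding parameter are hypothetical.

```php
<?php
/**
 * Return $input as UTF-8, ready to be stored (hypothetical helper).
 * $legacyEncoding is an assumption about what the old front end submits.
 */
function to_utf8_for_storage(string $input, string $legacyEncoding = 'GBK'): string
{
    // Already valid UTF-8 (for example an AJAX submission): leave it untouched.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }
    // Otherwise treat the bytes as the legacy encoding and transcode them.
    return mb_convert_encoding($input, 'UTF-8', $legacyEncoding);
}

// Sample input: the two characters "中文" encoded as GBK.
$fromForm = "\xD6\xD0\xCE\xC4";
echo bin2hex(to_utf8_for_storage($fromForm)), "\n";
```

The validity check is a heuristic: a few GBK byte sequences also happen to be valid UTF-8, so when the source encoding of a form is known for certain it is safer to convert unconditionally.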

Of PHP's transcoding options, mbstring supports the widest range of characters and iconv somewhat fewer. Even mbstring cannot transcode certain special characters, however, so there is no perfect transcoding method so far.

A comparison of mbstring and iconv:

mbstring:
1. Supports the most complete range of characters.
2. Recognizes the encoding of the content automatically, so the original character encoding does not have to be determined, but it runs much slower than iconv.
3. $content = mb_convert_encoding($content, "UTF-8", "GBK, GB2312, BIG5"); (the result varies with the order of the encodings)

iconv:
1. Its character support is incomplete.
2. The encoding of the original characters must be determined, but once it is known, iconv is more efficient than mb_convert_encoding.
3. $content = iconv("GBK", "UTF-8", $content);
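
To make the comparison concrete, the short sketch below runs both functions on the same GBK input. It assumes the mbstring and iconv extensions are both loaded, and the sample bytes (the characters "中文" in GBK) are chosen for illustration only.

```php
<?php
// "中文" encoded as GBK (sample input for this sketch).
$gbk = "\xD6\xD0\xCE\xC4";

// mbstring: the third argument may also be a comma-separated list of
// candidate encodings; here the source encoding is named explicitly.
$viaMb = mb_convert_encoding($gbk, 'UTF-8', 'GBK');

// iconv: the source encoding must always be named; iconv() returns
// false if the conversion fails.
$viaIconv = iconv('GBK', 'UTF-8', $gbk);

// Print the resulting bytes in hex so the two results can be compared.
echo bin2hex($viaMb), "\n";
echo $viaIconv === false ? 'iconv failed' : bin2hex($viaIconv), "\n";
```

For ordinary text like this the two functions should agree; differences are more likely to show up with rare or special characters, which is where the coverage gap described above matters.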
