MySQL: storing data as utf8 and outputting it as gbk
- 2020-05-06 11:45:41
- OfStack
A site may need to be converted from gb2312 (gbk, big5) to utf8, and many problems come up along the way. If the site is too large to convert in one go, the work can be done step by step. Converting the data first, with very little change to the front-end code, makes the whole thing much easier. After several days of testing, I found that it is possible to store data as utf8 and output it as gbk. MySQL 4.1 has a feature that specifies the character set used for the current client connection. By default the connection uses latin1, or the character set and collation configured for the MySQL server. I create the fields with the utf8_general_ci collation.
DB:
SQL code:
CREATE TABLE `table` (
`id` INT( 10 ) NOT NULL ,
`name` VARCHAR( 50 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ,
INDEX ( `id` )
) ENGINE = innodb CHARACTER SET utf8 COLLATE utf8_general_ci;
PHP:
Write operations set the connection character set and collation to utf8, and read operations set them to gbk.
PHP code:
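<?php
// Both functions use $this, so they belong inside a database class that
// holds the database name ($this->db_name) and the open link ($this->db_link).
// Note: the mysql_* extension used here was removed in PHP 7.0.

// Select the DB and switch the connection to utf8 (call before writing)
function _select_db_utf()
{
    mysql_select_db($this->db_name, $this->db_link);
    // Set the connection character set and collation
    mysql_query("SET NAMES utf8", $this->db_link);
    mysql_query("SET CHARACTER SET utf8", $this->db_link);
    mysql_query("SET COLLATION_CONNECTION='utf8_general_ci'", $this->db_link);
    return true;
}

// Select the DB and switch the connection to gbk (call before reading)
function _select_db_gb()
{
    mysql_select_db($this->db_name, $this->db_link);
    // Set the connection character set and collation
    mysql_query("SET NAMES gbk", $this->db_link);
    mysql_query("SET CHARACTER SET gbk", $this->db_link);
    mysql_query("SET COLLATION_CONNECTION='gbk_chinese_ci'", $this->db_link);
    return true;
}
?>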
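On a PHP version without the mysql_* functions, the same idea can be sketched with mysqli. This is only an illustration, not the original author's code: the helper names charset_statements() and use_charset() are hypothetical, and the charset-to-collation map simply repeats the two pairs used above.

```php
<?php
// Hypothetical helper: build the statements that switch a connection
// to the given character set, mirroring the three queries used above.
function charset_statements(string $charset): array
{
    // Only the two charset/collation pairs from the article are allowed.
    $collations = ['utf8' => 'utf8_general_ci', 'gbk' => 'gbk_chinese_ci'];
    if (!isset($collations[$charset])) {
        throw new InvalidArgumentException("Unsupported charset: $charset");
    }
    return [
        "SET NAMES $charset",
        "SET CHARACTER SET $charset",
        "SET COLLATION_CONNECTION='{$collations[$charset]}'",
    ];
}

// Hypothetical helper: run those statements on an open mysqli link.
// Returns false as soon as one statement fails.
function use_charset(mysqli $link, string $charset): bool
{
    foreach (charset_statements($charset) as $sql) {
        if (!$link->query($sql)) {
            return false;
        }
    }
    return true;
}
```

Assuming $link is an already connected mysqli object, the write path would call use_charset($link, 'utf8') before storing data, and the read path would call use_charset($link, 'gbk') before querying it for a gbk page.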
A few things to note:
1. MySQL must be compiled with the gbk, gb2312, and utf8 character sets.
2. The stored data must be guaranteed to be correctly encoded UTF-8.
3. Both write and read operations must set the correct character set and collation for the connection.
If the front-end code cannot hand the data to the store as UTF-8, the characters need to be transcoded first. (Data submitted via AJAX, for example, is already correct UTF-8 and is not converted.)
mb_string has the most complete character support in PHP and iconv is somewhat weaker, but even mb_string cannot transcode some special characters, so there is no perfect transcoding method so far.
mb_string and iconv compared:
mb_string:
1. Supports the most complete set of characters.
2. Automatically recognizes the encoding of the content, so the original character encoding does not need to be determined, but it runs much slower than iconv.
3. $content = mb_convert_encoding($content, "UTF-8", "GBK, GB2312, BIG5"); (the result varies with the order of the encodings)
iconv:
1. Character support is less complete.
2. The original character encoding must be determined first, but once it is known, iconv is more efficient than mb_convert_encoding.
3. $content = iconv("GBK", "UTF-8", $content);
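The transcoding step can be sketched as a small guard placed in front of the store. This is a minimal sketch, not the author's code: the function name to_utf8() and the default source encoding GBK are assumptions. The validity check is also only a heuristic, since a few GBK byte sequences happen to be valid UTF-8 as well.

```php
<?php
// Hypothetical helper: make sure $content is UTF-8 before it is stored.
// $assumedSource is the encoding the legacy front end is assumed to send.
function to_utf8(string $content, string $assumedSource = 'GBK'): string
{
    // Input that is already valid UTF-8 (for example AJAX submissions)
    // is returned unchanged so it is not converted a second time.
    if (mb_check_encoding($content, 'UTF-8')) {
        return $content;
    }
    // Otherwise treat the bytes as the assumed legacy encoding.
    return mb_convert_encoding($content, 'UTF-8', $assumedSource);
}

// "\xD6\xD0\xCE\xC4" is the GBK byte sequence for 中文.
$stored = to_utf8("\xD6\xD0\xCE\xC4");
```

When the source encoding is known for certain, iconv("GBK", "UTF-8", $content) could replace the mb_convert_encoding call for speed, at the cost of the weaker character support noted above.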