Examples of sorting UTF-8 encoded Chinese text in PHP and MySQL

  • 2021-07-22 09:19:32
  • OfStack

This article shows, by example, how to sort UTF-8 encoded Chinese text in PHP and MySQL, shared here for your reference. The implementation is as follows:

Generally speaking, there are three ways to sort Chinese text:

1. Sort according to Pinyin;
2. Sort according to strokes;
3. Sort according to radicals.

The system's default sorting mode is Pinyin, which is also the one most commonly used. The rest of this article covers sorting by Pinyin.

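1. Sorting a PHP array that contains Chinese. Files and data are usually UTF-8 encoded, and calling asort directly on UTF-8 strings does not give Pinyin order, while the same strings encoded as GBK or GB2312 do sort as expected. The difference comes from how each encoding assigns its codes: UTF-8 byte order follows Unicode code points, whereas the commonly used characters in GB2312 (and therefore in GBK) are laid out in Pinyin order. Less common characters are not arranged by Pinyin in these encodings, so the approach works best for everyday text. The function below converts the array to GBK, sorts it, and converts it back: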

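function utf8_array_asort(&$array)
{
    // Reject anything that is not an array.
    if (!isset($array) || !is_array($array)) {
        return false;
    }
    // Convert every value from UTF-8 to GBK; //IGNORE silently drops
    // characters that GBK cannot represent.
    foreach ($array as $k => $v) {
        $array[$k] = iconv('UTF-8', 'GBK//IGNORE', $v);
    }
    // GBK byte order matches Pinyin order for common characters;
    // asort keeps the key/value associations.
    asort($array);
    // Convert the sorted values back to UTF-8.
    foreach ($array as $k => $v) {
        $array[$k] = iconv('GBK', 'UTF-8//IGNORE', $v);
    }
    return true;
}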
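
The difference between the two orders can be seen in a minimal sketch like the one below. It assumes the iconv extension is available. The sample names and the helper utf8_sorted_by_gbk are hypothetical and not part of the original article. Unlike utf8_array_asort, this variant only uses the GBK bytes as sort keys, so the UTF-8 values are never rewritten and no characters can be lost in a round trip.

```php
<?php
// Illustrative data (not from the original article).
$names = ['z' => '张三', 'l' => '李四', 'w' => '王五', 'c' => '陈六'];

// Plain asort() compares raw UTF-8 bytes, which follows Unicode code points.
$byUtf8 = $names;
asort($byUtf8, SORT_STRING);

// Hypothetical helper: returns a copy sorted by GBK byte order while
// leaving the UTF-8 values untouched.
function utf8_sorted_by_gbk(array $items): array
{
    // Build one GBK sort key per element so iconv runs once per value.
    $keys = [];
    foreach ($items as $k => $v) {
        $gbk = iconv('UTF-8', 'GBK//IGNORE', (string) $v);
        // Fall back to the original bytes if the conversion fails.
        $keys[$k] = ($gbk === false) ? (string) $v : $gbk;
    }
    // Order the array by comparing the GBK keys byte by byte;
    // key/value associations are preserved.
    uksort($items, function ($a, $b) use ($keys) {
        return strcmp($keys[$a], $keys[$b]);
    });
    return $items;
}

$byGbk = utf8_sorted_by_gbk($names);

echo implode(', ', $byUtf8), "\n"; // 张三, 李四, 王五, 陈六 (code point order)
echo implode(', ', $byGbk), "\n";  // 陈六, 李四, 王五, 张三 (Pinyin: chen, li, wang, zhang)
```

Sorting on precomputed keys costs one extra array, but it avoids the double conversion and keeps any character that GBK cannot represent.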

2. In MySQL we often sort and search on a field, but when the field contains Chinese characters the results are often wrong. This happens in many versions of MySQL.

Unless this problem is solved, MySQL cannot really handle Chinese text. The cause is that MySQL compares strings case-insensitively and is generally compiled with an ISO-8859 character set (latin1) as the default, so the bytes of Chinese characters go through case conversion during comparison, which produces the wrong results.

Solution:

Add the BINARY attribute to the field containing Chinese so that it is compared as binary, for example change "name char(10)" to "name char(10) binary".

If you build MySQL from source, you can compile it with the --with-charset=gbk parameter so that MySQL supports Chinese search and sorting directly (the default is latin1). You can also use --with-extra-charsets=gb2312,gbk to add multiple character sets.

If you do not want to modify the table structure or recompile MySQL, you can also use the CONVERT function in the ORDER BY clause of the query. For example:

select * from mytable order by CONVERT(chineseColumnName USING gbk);
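
When the query is assembled in PHP, the CONVERT clause can be generated by a small helper. The sketch below is only an illustration: order_by_gbk is a hypothetical function, not part of MySQL or PHP, and it assumes column names are plain identifiers. It reuses the table and column names from the query above.

```php
<?php
// Hypothetical helper: builds an ORDER BY clause that sorts a column
// through CONVERT(... USING gbk).
function order_by_gbk(string $column, bool $descending = false): string
{
    // Accept only plain identifiers so the name can be embedded safely.
    if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $column)) {
        throw new InvalidArgumentException("Invalid column name: $column");
    }
    $direction = $descending ? 'DESC' : 'ASC';
    return "ORDER BY CONVERT(`$column` USING gbk) $direction";
}

$sql = 'SELECT * FROM mytable ' . order_by_gbk('chineseColumnName');
echo $sql, "\n";
// SELECT * FROM mytable ORDER BY CONVERT(`chineseColumnName` USING gbk) ASC
```

Restricting the column name to a simple identifier pattern keeps user-supplied sort fields from injecting SQL, since identifiers cannot be bound as prepared-statement parameters.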

I hope this article is helpful to everyone's PHP+MySQL programming.

