Steps for changing the MySQL character set from utf8 to utf8mb4

  • 2021-12-09 10:18:34
  • OfStack

In MySQL 5.5, if no character set is configured, the default character set is latin1.

However, as services have grown, applications increasingly need to store emoji in addition to the characters of various languages. Before MySQL 5.5, MySQL's utf8 encoding supported only 1 to 3 bytes per character, which covers just the Unicode Basic Multilingual Plane (BMP). Starting with MySQL 5.5 (5.5.3, to be exact), the 4-byte utf8mb4 encoding is available, so characters outside the BMP, including most emoji, can be stored as well.
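The byte counts above are easy to check outside MySQL. The following is a minimal Python sketch, not part of the original setup; the helper names utf8_len and fits_in_3_byte_utf8 are made up for this illustration. It shows that BMP characters need at most 3 bytes in UTF-8, while an emoji needs 4:

```python
def utf8_len(ch: str) -> int:
    """Number of bytes UTF-8 needs to encode one character."""
    return len(ch.encode("utf-8"))


def fits_in_3_byte_utf8(text: str) -> bool:
    """True if every character is in the BMP (code point <= U+FFFF),
    i.e. each one can be encoded in at most 3 UTF-8 bytes."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# "A", "é", "中" and the grinning-face emoji, written as escapes
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_len(ch)} byte(s)")
# U+0041 -> 1 byte(s)
# U+00E9 -> 2 byte(s)
# U+4E2D -> 3 byte(s)
# U+1F600 -> 4 byte(s)

print(fits_in_3_byte_utf8("hello \u4e2d"))      # True
print(fits_in_3_byte_utf8("hello \U0001F600"))  # False
```

A check like fits_in_3_byte_utf8 could be used on the application side to spot text that a 3-byte utf8 column would not be able to hold.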

utf8mb4 is compatible with utf8 and is a superset of it, so it can represent more characters. For that reason some newer services, such as iOS apps, set the MySQL database character set to utf8mb4.

While handling an application requirement today, I ran into the following problem:

Of course, the best fix is to change the character set of both the client and the MySQL database to utf8mb4, but that requires a relatively large number of changes. Moreover, if some clients are changed to utf8 and others to utf8mb4, confusion can easily arise.

After several tests, I changed the character set settings in the MySQL my.cnf file to the following:


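[client]
# Default character set for client programs that read this group
default-character-set=utf8mb4

[mysqld]
# Server-wide default character set and collation
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
# Executed for each new client connection (not for users with the SUPER privilege)
init_connect='SET NAMES utf8mb4'
# Ignore the character set requested by the client and use the server default
skip-character-set-client-handshake = true

[mysql]
default-character-set = utf8mb4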
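Note that these my.cnf settings only change the defaults for connections and for newly created objects; existing databases and tables keep the character set they were created with. As a hedged sketch of how the conversion statements could be generated, the Python helper below builds ALTER DATABASE and ALTER TABLE ... CONVERT TO CHARACTER SET statements. The function names and the database and table names (app_db, users, messages) are hypothetical, and the output should be reviewed and tried on a copy of the data first:

```python
def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(db, tables,
                          charset="utf8mb4",
                          collation="utf8mb4_unicode_ci"):
    """Build SQL that names the character set explicitly for one database
    and converts each listed table (and its existing column data)."""
    stmts = [
        f"ALTER DATABASE {quote_ident(db)} "
        f"CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        stmts.append(
            f"ALTER TABLE {quote_ident(db)}.{quote_ident(table)} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return stmts


# Hypothetical database and table names, for illustration only
for stmt in conversion_statements("app_db", ["users", "messages"]):
    print(stmt)
```

The script only prints the statements; running them through the mysql client remains a separate, deliberate step.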

Recommendations for using MySQL character sets

Explicitly specify the character set when creating databases and tables and when performing database operations, rather than relying on MySQL's defaults; otherwise a MySQL upgrade can cause serious trouble.

When latin1 is used for both the database and the connection character set, garbled text is avoided in most cases, but SQL operations can no longer work in units of characters, because each byte of a multi-byte character is treated as a separate character. In general, it is better to set both the database and the connection character set to utf8 (or to utf8mb4 when 4-byte characters such as emoji are needed).

When using the MySQL C API, set the MYSQL_SET_CHARSET_NAME option to utf8 with mysql_options immediately after initializing the database handle. The connection character set then does not need to be set explicitly with a SET NAMES statement, and it is reset to utf8 automatically when mysql_ping reconnects a dropped long-lived connection.

With the MySQL PHP API, a page-level PHP script usually runs only briefly, so it is enough to set the connection character set explicitly once with a SET NAMES statement after connecting. With persistent connections, however, take care to keep the connection alive, and after a disconnect and reconnect, set the connection character set again explicitly with SET NAMES.
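To illustrate the latin1 recommendation above, the following Python sketch mimics what happens when UTF-8 text passes through a latin1 database and connection. It assumes Python's latin-1 codec is a close enough stand-in for MySQL's latin1 (which is actually closer to cp1252), and the helper names are made up for this example:

```python
def length_seen_as_latin1(text: str) -> int:
    """Character count if the UTF-8 bytes of `text` are treated as latin1,
    where every byte counts as one character."""
    raw = text.encode("utf-8")
    return len(raw.decode("latin-1"))


def roundtrip_via_latin1(text: str) -> str:
    """Store UTF-8 bytes as latin1 and read them back unchanged."""
    stored = text.encode("utf-8").decode("latin-1")
    return stored.encode("latin-1").decode("utf-8")


word = "\u4e2d\u6587"  # two Chinese characters

# The bytes survive the round trip, so nothing looks garbled...
print(roundtrip_via_latin1(word) == word)  # True

# ...but character-based operations are off: 2 characters look like 6.
print(len(word))                    # 2
print(length_seen_as_latin1(word))  # 6
```

This is why length, substring, and similar character-based functions give misleading results on multi-byte text kept in latin1 columns.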

For a full explanation of the difference between utf8 and utf8mb4 in MySQL, see: https://www.ofstack.com/article/90037.htm


