Use choices and pros and cons of mysqli_set_charset and SET NAMES

  • 2020-05-27 04:39:21
  • OfStack

Recently, the company organized a training on PHP security programming, which involved part 1 of Mysql's "SET NAMES" and mysql_set_charset (mysqli_set_charset) :
When it comes to mysqli_set_charset(mysqli:set_charset) instead of "SET NAMES", try to use mysqli_set_charset(mysqli:set_charset).

Recently, several friends asked me this question. Why on earth?
The more people ask, the more I think I can write an blog to specifically introduce this part of the content.
First of all, a lot of people don't know what SET NAMES is doing,
In my previous article, I went deep into the setting of MySQL character set and introduced the three "environment variables" of character_set_client/character_set_connection/character_set_results.
These three variables tell the MySQL server, the client's encoding set, the encoding set when transmitted to the MySQL server, and the expected encoding set of the results returned by MySQL.
For example, by using "SET NAMES utf8," you can tell the server that I'm using the utf-8 encoding, and I want you to return me the utf-8 encoding query results as well.

In general, "SET NAMES" is sufficient and correct. So why does the manual recommend mysqli_set_charset(PHP > = 5.0.5)?
First, let's take a look at what mysqli_set_charset actually does (note the asterisk comment, mysql_set_charset is similar):
//php-5.2.11-SRC/ext/mysqli/mysqli_nonapi.c line 342 
char *cs_name = NULL; 
unsigned int len; 
if (zend_parse_method_parameters(ZEND_NUM_ARGS() TSRMLS_CC, getThis() 
, "Os", &mysql_link, mysqli_link_class_entry, &cs_name, &len) == FAILURE) { 
MYSQLI_FETCH_RESOURCE(mysql, MY_MYSQL*, &mysql_link, "mysqli_link" 
if (mysql_set_character_set(mysql->mysql, cs_name)) { 
//**  call libmysql Corresponding function of  

What does mysql_set_character_set do?
//mysql-5.1.30-SRC/libmysql/client.c, line 3166: 
int STDCALLmysql_set_character_set(MYSQL*mysql, const char *cs_name) 
structcharset_info_st *cs; 
const char *save_csdir= charsets_dir; 
if (mysql->options.charset_dir) 
charsets_dir= mysql->options.charset_dir; 
if (strlen(cs_name) < MY_CS_NAME_SIZE && 
(cs= get_charset_by_csname(cs_name, MY_CS_PRIMARY, MYF(0)))) 
char buff[MY_CS_NAME_SIZE + 10]; 
charsets_dir= save_csdir; 
/* Skip execution of "SET NAMES" for pre-4.1 servers */ 
if (mysql_get_server_version(mysql) < 40100) 
return 0; 
sprintf(buff, "SET NAMES %s", cs_name); 
if (!mysql_real_query(mysql, buff, strlen(buff))) 
mysql->charset= cs; 
// The following is omitted  

As we can see, mysqli_set_charset does one more step besides "SET NAMES" :
sprintf(buff, "SET NAMES %s", cs_name); 
if (!mysql_real_query(mysql, buff, strlen(buff))) 
mysql->charset= cs; 

What does charset do for members of the core mysql structure?
The difference between mysql_real_escape_string() and mysql_escape_string is that it considers the "current" character set. So where does this character set come from?
Well, you guessed it, mysql- > charset.
When mysql_real_string judges the characters of the wide character set, it adopts different strategies according to this member variable. For example, if it is utf-8, libmysql/ ctype-utf8.c will be adopted.
In an example, the default mysql connection character set is latin-1, (classic 5c problem):
$db = mysql_connect('localhost:3737', 'root' ,'123456'); 
$a = "\x91\x5c";//" � " the gbk coding ,  The low byte of 5c,  That is ascii In the "\" 
var_dump(mysql_real_escape_string($a, $db)); 
mysql_query("set names gbk"); 
var_dump(mysql_real_escape_string($a, $db)); 
var_dump(mysql_real_escape_string($a, $db)); 

This is because the low byte of the gbk encoder of the "wok" is 5c, which is the "\" in ascii, and because mysql(i)_set_charset affects mysql- except mysql(i)_set_charset > Other than charset, other moments mysql- > charset is the default, so the result is:
$ php -f 5c.php 
string(3) " � \" 
string(3) " � \" 
string(3) " � \" 
string(2) " � " You know that by now ? 

Related articles: