Understand the difference between mysql SET NAMES and mysql of i _set_charset

  • 2020-05-12 06:21:44
  • OfStack

As far as possible, use mysqli_set_charset(mysqli:set_charset) instead of "SET NAMES". Of course, this is also covered in the PHP manual, but there is no explanation as to why.
Recently, several friends asked me this question. Why on earth?
With more people asking, I think I can write an blog to specifically introduce this part of the content.
First of all, a lot of people don't know what SET NAMES is doing,
In my previous article, I went deep into the setting of MySQL character set and introduced the three "environment variables" of character_set_client/character_set_connection/character_set_results. Here is a brief introduction.
These three variables tell the MySQL server, the client's encoding set, the encoding set at the time of transmission to the MySQL server, and the encoding set of the results expected to be returned by MySQL.
For example, by using "SET NAMES utf8," you can tell the server that I'm using the utf-8 code, and I'd like you to return me the utf-8 code as well.
In general, "SET NAMES" is sufficient and correct. So why does the manual recommend mysqli_set_charset(PHP > = 5.0.5)?
First, let's take a look at what mysqli_set_charset actually does (note the asterisk comment, mysql_set_charset is similar):
 
  //php-5.2.11-SRC/ext/mysqli/mysqli_nonapi.c line 342 
  PHP_FUNCTION(mysqli_set_charset) 
  { 
  MY_MYSQL *mysql; 
  zval *mysql_link; 
  char *cs_name = NULL; 
  unsigned int len; 
  if (zend_parse_method_parameters(ZEND_NUM_ARGS() TSRMLS_CC, getThis() 
  , "Os", &mysql_link, mysqli_link_class_entry, &cs_name, &len) == FAILURE) { 
  return; 
  } 
  MYSQLI_FETCH_RESOURCE(mysql, MY_MYSQL*, &mysql_link, "mysqli_link" 
  , MYSQLI_STATUS_VALID); 
  if (mysql_set_character_set(mysql->mysql, cs_name)) { 
  //**  call libmysql Corresponding function of  
  RETURN_FALSE; 
  } 
  RETURN_TRUE; 
  } 

So what did mysql_set_character_set do?
 
  //mysql-5.1.30-SRC/libmysql/client.c, line 3166: 
  int STDCALL mysql_set_character_set(MYSQL *mysql, const char *cs_name) 
  { 
  struct charset_info_st *cs; 
  const char *save_csdir= charsets_dir; 
  if (mysql->options.charset_dir) 
  charsets_dir= mysql->options.charset_dir; 
  if (strlen(cs_name) < MY_CS_NAME_SIZE && 
  (cs= get_charset_by_csname(cs_name, MY_CS_PRIMARY, MYF(0)))) 
  { 
  char buff[MY_CS_NAME_SIZE + 10]; 
  charsets_dir= save_csdir; 
  /* Skip execution of "SET NAMES" for pre-4.1 servers */ 
  if (mysql_get_server_version(mysql) < 40100) 
  return 0; 
  sprintf(buff, "SET NAMES %s", cs_name); 
  if (!mysql_real_query(mysql, buff, strlen(buff))) 
  { 
  mysql->charset= cs; 
  } 
  } 
  // The following is omitted  

As we can see, mysqli_set_charset does one more step besides "SET NAMES" :
 
  sprintf(buff, "SET NAMES %s", cs_name); 
  if (!mysql_real_query(mysql, buff, strlen(buff))) 
  { 
  mysql->charset= cs; 
  } 

What does charset do for members of the core mysql structure?
The difference between mysql_real_escape_string() and mysql_escape_string is that it takes into account the "current" character set. So where does the current character set come from?
Well, you guessed it, mysql- > charset.
When mysql_real_string judges the characters of the wide character set, it USES different strategies according to this member variable. For example, if it is utf-8, libmysql/ ctype-utf8.c will be used.
As an example, the default mysql connection character set is latin-1, (classic 5c problem):
 
  <?php 
  $db = mysql_connect('localhost:3737', 'root' ,'123456'); 
  mysql_select_db("test"); 
  $a = "\x91\x5c";//" � " the gbk coding ,  The low byte of 5c,  That is ascii In the "\" 
  var_dump(addslashes($a)); 
  var_dump(mysql_real_escape_string($a, $db)); 
  mysql_query("set names gbk"); 
  var_dump(mysql_real_escape_string($a, $db)); 
  mysql_set_charset("gbk"); 
  var_dump(mysql_real_escape_string($a, $db)); 
  ?> 

This is because the low byte of the gbk encoder of the "wok" is 5c, which is the "\" in ascii, and because mysql(i)_set_charset affects mysql- > Other than charset, other moments mysql- > charset is the default, so the result is:
 
  $ php -f 5c.php 
  string(3) " � \" 
  string(3) " � \" 
  string(3) " � \" 
  string(2) " � " 

Is it clear to you now?

Related articles: