Code Sharing: Decoding Unicode-Escaped Chinese Characters in PHP

  • 2021-07-10 19:05:55
  • OfStack

Background of the problem:
In the evening, I was scraping data from a website and found a string of encoded data like this in the returned data packet: "......\u65b0\u6d6a\u5fae\u535a......". It is actually Chinese text written as Unicode escape sequences (\uXXXX). All I wanted was to decode it back into Chinese.

Solution:
Fortunately, developers overseas had already worked this out, and the schemes below follow the solution they posted.

Scheme A (stable version + recommended):


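// Callback: $match[1] holds the four hex digits captured after "\u".
function replace_unicode_escape_sequence($match) {
  // pack('H*') turns the hex digits into two raw bytes (UCS-2, big-endian),
  // which mb_convert_encoding() then converts to UTF-8.
  return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}
$name = '\u65b0\u6d6a\u5fae\u535a';
$str = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $name);
echo $str; // Output: 新浪微博 (Sina Weibo)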

// Now let's wrap Scheme A in a class (Scheme A: stable + upgraded + recommended)
class Helper_Tool
{
  public static function unicodeDecode($data)
  {
    // Use a closure rather than a named function declared inside the method:
    // a nested named function is declared again on every call, so a second
    // call would stop with a "Cannot redeclare function" fatal error.
    $rs = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', function ($match) {
      return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
    }, $data);

    return $rs;
  }
}
// Call
$name = '\u65b0\u6d6a\u5fae\u535a';
$data = Helper_Tool::unicodeDecode($name);
echo $data; // Output: 新浪微博 (Sina Weibo)
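
One limitation worth noting: Scheme A converts each \uXXXX escape on its own as UCS-2, so characters outside the Basic Multilingual Plane (for example emoji, which are written as two escapes forming a surrogate pair) are not decoded correctly. The following is only a minimal sketch of one way around that, assuming the mbstring extension is available. The function name decode_unicode_escapes is made up for illustration and is not part of the original solution. It matches a whole run of consecutive escapes and decodes the run as UTF-16BE, so surrogate pairs are combined.

```php
<?php
// Hypothetical helper (not from the original article): decodes runs of
// \uXXXX escapes as UTF-16BE so that surrogate pairs are combined.
function decode_unicode_escapes($text) {
    return preg_replace_callback(
        '/(?:\\\\u[0-9a-f]{4})+/i',   // one or more consecutive \uXXXX escapes
        function ($match) {
            // Strip every "\u" prefix, leaving one continuous hex string.
            $hex = str_ireplace('\\u', '', $match[0]);
            // Hex digits -> raw UTF-16BE bytes -> UTF-8.
            return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
        },
        $text
    );
}

echo decode_unicode_escapes('\u65b0\u6d6a\u5fae\u535a'), "\n"; // 新浪微博
echo decode_unicode_escapes('\ud83d\ude00'), "\n";             // 😀 (U+1F600)
```

For text that only contains Chinese characters from the Basic Multilingual Plane, this should behave the same as Scheme A.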


Scheme B (second choice):


<?php
function unicodeDecode($name){
  $json = '{"str":"'.$name.'"}';
  $arr = json_decode($json,true);
  if(empty($arr)) return ''; 
  return $arr['str'];
}
$name = '\u65b0\u6d6a\u5fae\u535a';
echo unicodeDecode($name); // Output: 新浪微博 (Sina Weibo)

For Scheme B, there is one point that needs attention, which I worked out with help from my friend XAR (see XAR's blog). The string to be processed, that is, the value of the $name parameter passed to unicodeDecode(), is spliced directly into a JSON string literal. It therefore must not contain unescaped double quotation marks (plain single quotation marks are valid inside a JSON string and do not cause a problem), nor other characters that are invalid there, such as raw line breaks. Otherwise json_decode() fails to parse it and the function returns an empty string. If necessary, use str_replace() to convert the offending characters into valid ones before decoding.
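
To make the caveat concrete: a double quotation mark inside $name ends the JSON string early, so json_decode() returns null and Scheme B yields an empty string. Below is a minimal sketch of the str_replace() workaround. The function name unicodeDecodeSafe is hypothetical, and the sketch assumes that any double quotes in the input are not already backslash-escaped (an input containing \" would be broken by the extra escaping).

```php
<?php
// Hypothetical variant of Scheme B; the name unicodeDecodeSafe is an
// assumption made for this example.
function unicodeDecodeSafe($name) {
    // Assumption: quotes in $name are not already escaped with a backslash.
    $safe = str_replace('"', '\\"', $name);
    $arr = json_decode('{"str":"' . $safe . '"}', true);
    // json_decode() returns null when the JSON text is invalid.
    return is_array($arr) ? $arr['str'] : '';
}

echo unicodeDecodeSafe('\u65b0\u6d6a "\u5fae\u535a"'), "\n"; // 新浪 "微博"
```

Raw line breaks or stray backslashes in the input would still make the JSON invalid, so Scheme A remains the safer choice for arbitrary text.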

