Fixing garbled Chinese output from json_encode in PHP with GBK and GB2312 data

  • 2021-07-07 06:42:46
  • OfStack

This article describes how to fix the garbled Chinese text that json_encode produces in PHP when the data is encoded in GBK or GB2312. The specific methods are as follows:

1. json_encode() returns null for Chinese text encoded in GBK/GB2312


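<?php
// 'Script House' stands in for a Chinese string; this file is assumed
// to be saved as GBK/GB2312, so the value is not valid UTF-8.
$arr = array(
    array(
        'catid'      => '4',
        'catname'    => 'www.ofstack.com',
        'meta_title' => 'Script House'
    )
);

echo json_encode($arr);
?>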

Run results:


[{"catid":"4","catname":"www.ofstack.com","meta_title":null}]

Note that "meta_title" comes back as null even though the array held a Chinese string there. The reason is that json_encode only supports UTF-8 encoded strings, so the value has to be converted to UTF-8 first:


<?php
// $data stands in for a GB2312-encoded string containing Chinese text.
$data = "JSON Chinese";
$newData = iconv("GB2312", "UTF-8//IGNORE", $data);
echo $newData;
// //IGNORE skips characters that cannot be converted. Without it,
// the conversion stops at the first such character and everything after it is lost.
// Alternatively: iconv("GB2312", "UTF-8", $data);
?>
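
The same conversion can be applied to a whole array before encoding. The following is only a minimal sketch: `gbk_to_utf8` is a hypothetical helper name (not part of PHP or of the original article), the array keys are assumed to be ASCII, and the runtime is assumed to be PHP 5.5 or later, where json_encode returns false on invalid UTF-8 instead of emitting null for the field. The Chinese text is written as raw GBK bytes so the example does not depend on how the file itself is saved.

```php
<?php
// Hypothetical helper (not from the article): recursively convert every
// string value in an array from GBK to UTF-8. Array keys are assumed ASCII.
function gbk_to_utf8($value)
{
    if (is_array($value)) {
        return array_map('gbk_to_utf8', $value);
    }
    if (is_string($value)) {
        return iconv('GBK', 'UTF-8', $value);
    }
    return $value; // numbers, booleans and null pass through unchanged
}

// "\xD6\xD0\xCE\xC4" is the GBK byte sequence for the two characters 中文.
$arr = array(
    array(
        'catid'      => '4',
        'catname'    => 'www.ofstack.com',
        'meta_title' => "\xD6\xD0\xCE\xC4",
    ),
);

// Raw GBK bytes are not valid UTF-8, so encoding fails.
$raw      = @json_encode($arr);
$rawError = json_last_error(); // JSON_ERROR_UTF8 on the assumed runtime

// After conversion the same data encodes cleanly.
$json = json_encode(gbk_to_utf8($arr));
echo $json, "\n";
```

With these inputs the script prints `[{"catid":"4","catname":"www.ofstack.com","meta_title":"\u4e2d\u6587"}]`. Checking `json_last_error()` right after a failed call is a cheap way to confirm that the encoding, and not the data structure, is the problem.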

2. When the server-side PHP page is encoded as UTF-8 (or its strings have already been converted to UTF-8), json_encode can convert a PHP array to a JSON string. For example:


<?php
// 中文字符串 means "Chinese string"; the file is saved as UTF-8.
$testJSON = array('name' => '中文字符串', 'value' => 'test');
echo json_encode($testJSON);
?>

View the output as follows:


{"name":"\u4e2d\u6587\u5b57\u7b26\u4e32","value":"test"}
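
The escaped form is still standard JSON, so any conforming parser turns it back into the original characters. A minimal sketch to illustrate this, using only json_decode and assuming the file is saved as UTF-8:

```php
<?php
// The \uXXXX escapes are standard JSON; decoding restores the UTF-8 text.
// Single quotes keep the backslashes literal in the PHP source.
$encoded = '{"name":"\u4e2d\u6587\u5b57\u7b26\u4e32","value":"test"}';
$decoded = json_decode($encoded, true);

echo $decoded['name'], "\n";  // the UTF-8 string 中文字符串
echo $decoded['value'], "\n"; // test
```

So the escapes only matter when the raw JSON text itself has to be readable, for example in logs or when it is inspected by hand.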

As the output shows, even with UTF-8 input json_encode does not emit the Chinese characters literally: it escapes them as \uXXXX sequences, which are valid JSON but hard to read. One way to get readable output is to pass each value through urlencode() before calling json_encode, and then apply urldecode() to the encoded result. The details are as follows:


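<?php
// 中文字符串 means "Chinese string"; the file is saved as UTF-8.
$testJSON = array('name' => '中文字符串', 'value' => 'test');
//echo json_encode($testJSON);
// URL-encode each value so json_encode only sees ASCII characters.
foreach ($testJSON as $key => $value) {
    $testJSON[$key] = urlencode($value);
}
// Decode the whole JSON string to restore the original characters.
echo urldecode(json_encode($testJSON));
?>

View the output as follows:


{"name":"中文字符串","value":"test"}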
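
As an alternative that the original article does not use, PHP 5.4 and later provide the `JSON_UNESCAPED_UNICODE` flag, which produces the same readable output in a single call. The sketch below also shows one case where the urlencode round trip appears to fall short: a value containing a double quote ends up unescaped after urldecode(), so the result is no longer valid JSON. The variable names are illustrative only, and the file is assumed to be saved as UTF-8.

```php
<?php
// This file is assumed to be saved as UTF-8.
$testJSON = array('name' => '中文字符串', 'value' => 'test');

// JSON_UNESCAPED_UNICODE (PHP 5.4+) keeps multibyte characters literal.
$readable = json_encode($testJSON, JSON_UNESCAPED_UNICODE);
echo $readable, "\n";

// The urlencode round trip breaks when a value contains a double quote,
// because urldecode() restores the quote without JSON escaping.
$tricky  = array('q' => 'say "hi"');
$encoded = array_map('urlencode', $tricky);
$broken  = urldecode(json_encode($encoded)); // {"q":"say "hi""}
$viaFlag = json_encode($tricky, JSON_UNESCAPED_UNICODE);

var_dump(json_decode($broken, true) === null);     // bool(true): invalid JSON
var_dump(json_decode($viaFlag, true) === $tricky); // bool(true): round-trips
```

The first echo prints `{"name":"中文字符串","value":"test"}`, matching the urlencode approach for this input while leaving quoting to json_encode.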

Summary: json_encode can only handle UTF-8 strings. The failure with Chinese text is probably a matter of byte-level processing, since a Chinese character does not occupy the same number of bytes in GBK as in UTF-8; this is not covered in depth here.
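
The difference in byte length is easy to check. A small sketch, assuming the iconv extension is available and writing the sample text as raw GBK bytes so the result does not depend on the encoding of the file:

```php
<?php
$gbk  = "\xD6\xD0\xCE\xC4";          // the two characters 中文 encoded as GBK
$utf8 = iconv('GBK', 'UTF-8', $gbk); // the same text converted to UTF-8

echo strlen($gbk), "\n";  // 4: two bytes per character in GBK
echo strlen($utf8), "\n"; // 6: three bytes per character in UTF-8
```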

