Explanation of the PHP iconv Function for Character Encoding Conversion

  • 2021-11-29 23:24:31
  • OfStack

In PHP, the iconv function library converts between character sets and is an indispensable basic library for PHP programming. Sometimes, however, iconv drops part of the data for no obvious reason. For example, an error occurs when converting the character "-" to gb2312.

Let's look at how this function is used.

The simplest use is converting GB2312 to UTF-8:


$text=iconv("GB2312","UTF-8",$text);

When using $text=iconv("UTF-8","GB2312",$text), the conversion breaks if it encounters certain special characters, such as "-" or the "." in English names. The text after these characters is not converted.

To solve this problem, you can use the following code:


$text=iconv("UTF-8","GBK",$text);

That's right, it's that simple: write GBK instead of gb2312. GBK is a superset of GB2312, so it can represent characters that GB2312 lacks.

Another way is to append //IGNORE to the second parameter so that conversion errors are ignored, as follows:


iconv("UTF-8","GB2312//IGNORE",$data);

The two methods have not been compared in detail, but the first one (GBK instead of gb2312) seems to work better.
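To compare the two approaches on your own data, a minimal sketch along the following lines may help. The helper name convert_first_success and the sample string are assumptions made for illustration, not part of the original article, and the code assumes PHP 7.1 or later. Which target succeeds can depend on the iconv implementation PHP was built against.

```php
<?php
// Hypothetical helper (not from the article): try each target charset in
// order and return the first conversion that succeeds, or null if none does.
function convert_first_success(string $text, array $targets): ?array
{
    foreach ($targets as $target) {
        // "@" hides the E_NOTICE that iconv raises on an unconvertible character.
        $out = @iconv('UTF-8', $target, $text);
        if ($out !== false) {
            return ['charset' => $target, 'bytes' => $out];
        }
    }
    return null;
}

// Assumed sample: U+9555 is a character that GBK contains but GB2312 does not.
$sample = "Zhu \u{9555}ji";
$result = convert_first_success($sample, ['GB2312', 'GBK', 'GB2312//IGNORE']);
echo $result === null ? "no target worked\n" : "converted with {$result['charset']}\n";
```

Listing GBK before GB2312//IGNORE reflects the preference stated above: a lossless conversion is tried before one that discards characters.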

The PHP manual describes iconv() as follows:

iconv

(PHP 4 >= 4.0.5, PHP 5)
iconv - Convert string to requested character encoding
Description
string iconv ( string in_charset, string out_charset, string str )
Performs a character set conversion on the string str from in_charset to out_charset. Returns the converted string or FALSE on failure.
If you append the string //TRANSLIT to out_charset, transliteration is activated. This means that when a character can't be represented in the target charset, it can be approximated through one or several similar-looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded. Otherwise, str is cut from the first illegal character.

Note that when this function is used to convert UTF-8 to gb2312, the string may be truncated. In that case, you can use the following:


$str=iconv('utf-8',"gb2312//TRANSLIT",file_get_contents($filepath));

That is, append //TRANSLIT to the second parameter. This means that if a character in the source encoding has no match in the target encoding, a similar character is chosen instead. You can also use //IGNORE here to skip characters that cannot be converted.

//IGNORE ignores conversion errors. Without it, everything in the string after the offending character is lost.
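Because iconv returns FALSE on failure, it is worth checking the return value rather than assuming the conversion worked. The sketch below compares the plain target with the two suffixes. The helper describe_conversion and the sample input are hypothetical, and the code assumes PHP 7 or later. The results differ between iconv implementations (for example glibc and libiconv), so no expected output is shown.

```php
<?php
// Hypothetical helper: run iconv and describe the result without raising notices.
function describe_conversion(string $input, string $target): string
{
    // "@" suppresses the E_NOTICE raised for an unconvertible character.
    $out = @iconv('UTF-8', $target, $input);
    if ($out === false) {
        return "$target: failed (iconv returned false)";
    }
    // Show the raw bytes as hex, since GB2312 output is not readable in a UTF-8 terminal.
    return sprintf('%s: %d bytes, hex %s', $target, strlen($out), bin2hex($out));
}

// Assumed sample: "A", then U+9555 (not representable in GB2312), then "B".
$input = "A\u{9555}B";
foreach (['GB2312', 'GB2312//TRANSLIT', 'GB2312//IGNORE'] as $target) {
    echo describe_conversion($input, $target), "\n";
}
```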

iconv is not a core PHP function. It is provided by the iconv extension, which in the PHP versions discussed here has to be installed and enabled before it can be used.

On Windows 2000 + PHP, edit the php.ini file and remove the ";" in front of extension=php_iconv.dll. You also need to copy iconv.dll from your original PHP installation to winnt/system32 (if your DLL path points to this directory). On Linux, with a static build, add --with-iconv to the configure options; phpinfo() will then show an iconv section. (Linux7.3 + Apache4.06 + php4.3.2).
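A quick way to confirm that the extension is actually available, without reading the full phpinfo() output, is sketched below. The helper name iconv_status is hypothetical; extension_loaded() and the ICONV_IMPL and ICONV_VERSION constants are provided by PHP and the iconv extension.

```php
<?php
// Hypothetical helper: describe whether iconv is usable in this PHP build.
function iconv_status(): string
{
    if (!extension_loaded('iconv')) {
        return 'iconv: not loaded (check php.ini or the configure options)';
    }
    // ICONV_IMPL and ICONV_VERSION are constants defined by the iconv extension.
    return 'iconv: loaded, implementation ' . ICONV_IMPL . ' ' . ICONV_VERSION;
}

echo iconv_status(), "\n";
```

The reported implementation is useful when debugging, since the handling of //TRANSLIT and //IGNORE is not identical across iconv libraries.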

Introduction to the mb_convert_encoding and iconv Functions

The mb_convert_encoding function converts between encodings. At first I didn't understand the concept of character encoding in programs, but now it is starting to make sense. English text generally has no encoding problems; they show up with Chinese data. For example, suppose you write a program in Zend Studio or EditPlus using GBK encoding. If the data has to go into a database whose encoding is utf8, you must convert it first; otherwise it will be garbled once it is stored.

An example converting GBK to UTF-8:


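<?php
// Declare the output charset so the browser decodes the converted text correctly.
header("Content-Type: text/html; charset=UTF-8");
// The third argument is the source encoding; the source file is assumed to be saved as GBK.
echo mb_convert_encoding(" You are my friend ", "UTF-8", "GBK");
?>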

Another example, converting GB2312 to Big5:


<?php 
header("content-Type: text/html; charset=big5"); 
echo mb_convert_encoding(" You're my friend ", "big5", "GB2312"); 
?>

However, to use the functions above, you first need to install and enable the mbstring extension.

string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] ) The mbstring extension must be enabled first: remove the ";" in front of extension=php_mbstring.dll in php.ini. mb_convert_encoding can be given several candidate input encodings, from which the right one is detected automatically based on the content, but it runs much more slowly than iconv.

string iconv ( string in_charset, string out_charset, string str ) Note: besides naming the target encoding, the second parameter can take one of two suffixes, //TRANSLIT and //IGNORE. //TRANSLIT automatically replaces characters that cannot be converted directly with one or more approximate characters, and //IGNORE skips characters that cannot be converted. The default behavior is to truncate at the first illegal character.

In general, use iconv. Use the mb_convert_encoding function only when the original encoding cannot be determined, or when the text does not display correctly after conversion with iconv.


$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");
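For the case where the original encoding is unknown, a rough sketch using mbstring detection might look like this. The helper name to_utf8_guess and the candidate list are assumptions chosen for illustration. Detection is only a guess: short or ambiguous input can be valid in more than one encoding.

```php
<?php
// Hypothetical helper: guess the source encoding from a candidate list,
// then convert to UTF-8. Returns false when no candidate fits.
function to_utf8_guess(string $content, array $candidates = ['UTF-8', 'GBK'])
{
    // Strict mode (third argument true) only accepts encodings in which
    // the bytes form a valid sequence.
    $from = mb_detect_encoding($content, $candidates, true);
    if ($from === false) {
        return false;
    }
    return mb_convert_encoding($content, 'UTF-8', $from);
}

// Usage: 0xC4 0xE3 is not valid UTF-8, so the GBK candidate is the one that fits.
$converted = to_utf8_guess("\xC4\xE3");
echo $converted === false ? "unknown encoding\n" : bin2hex($converted) . "\n";
```

When the source encoding is known, calling iconv directly with that encoding, as in the two lines above, avoids the detection step.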
