Summary of using the iconv function for character set conversion in PHP

  • 2021-07-21 08:00:45
  • OfStack

The iconv function library converts text between character sets and is an essential basic library in PHP programming.

The usage is as follows:


$string = "Welcome to Script Home!";
$converted = iconv("UTF-8", "GBK", $string); // Convert $string from UTF-8 to GBK
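iconv() returns the converted string rather than changing its argument, and it returns false on failure, so the result should be captured and checked. The following is a minimal sketch, assuming PHP 7 or later (for the "\u{...}" escapes) and a local iconv library that knows the GBK charset; the sample text and variable names are illustrative only.

```php
<?php
// "\u{4F60}\u{597D}" spells "你好" with escapes, so the bytes are UTF-8
// no matter which encoding this source file is saved in.
$utf8 = "\u{4F60}\u{597D}";

// iconv() returns a new string, or false if the conversion fails.
$gbk = iconv('UTF-8', 'GBK', $utf8);

if ($gbk === false) {
    echo "Conversion to GBK failed\n";
} else {
    // Converting back should reproduce the original UTF-8 bytes.
    $roundTrip = iconv('GBK', 'UTF-8', $gbk);

    echo bin2hex($gbk), "\n";          // the GBK bytes, shown as hex
    var_dump($roundTrip === $utf8);    // whether the round trip was lossless
}
```

Printing the bytes with bin2hex() avoids garbled terminal output when the terminal itself is not set to GBK.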

Further examples:


echo $str = 'Welcome to Script House!';
echo '<br />';
echo iconv('GB2312', 'UTF-8', $str); // Convert the string from GB2312 to UTF-8
echo '<br />';
echo iconv_substr($str, 1, 1, 'UTF-8'); // Extract by characters rather than bytes
print_r(iconv_get_encoding()); // Get the current iconv encoding settings
echo iconv_strlen($str, 'UTF-8'); // Get the length in characters for the given encoding
// The output charset can also carry a suffix such as //TRANSLIT
$content = iconv("UTF-8", "GBK//TRANSLIT", $content);

Remarks:

1. In the older PHP versions described here, iconv is neither a built-in function nor a module installed by default, so it has to be installed before it can be used (since PHP 5 the extension is enabled by default).

On Windows 2000 + PHP, edit the php.ini file and remove the ";" in front of extension=php_iconv.dll. You also need to copy iconv.dll from your original PHP installation to winnt/system32 (only if your DLL path points to that directory; I did not need this step on my local machine).
On Linux with a static build, add the --with-iconv option when running configure; phpinfo() will then show an iconv section. (Linux7.3 + Apache4.06 + php4.3.2)
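Besides reading the phpinfo() page, a script can check for the extension itself. This is a small sketch using extension_loaded() and function_exists(); the variable name and messages are illustrative.

```php
<?php
// True only when the iconv extension is loaded and its main function exists.
$hasIconv = extension_loaded('iconv') && function_exists('iconv');

if ($hasIconv) {
    // ICONV_IMPL and ICONV_VERSION are constants defined by the extension.
    echo "iconv is available: ", ICONV_IMPL, " ", ICONV_VERSION, "\n";
} else {
    echo "iconv is not loaded; enable it in php.ini or rebuild PHP with iconv support\n";
}
```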

That completes the setup of iconv in PHP.

2. When iconv is used to convert a fetched UTF-8 page to GB2312, the converted data often comes out shorter than the original for no apparent reason. The cause is as follows:


string iconv ( string in_charset, string out_charset, string str )

Note: besides naming the target encoding, the second parameter can carry one of two suffixes, //TRANSLIT and //IGNORE. //TRANSLIT replaces characters that cannot be converted directly with one or more approximate characters, while //IGNORE skips characters that cannot be converted. With neither suffix, conversion stops at the first illegal character, so the output is cut off at that point and ends up shorter (PHP 5.4 and later return false instead of the partial string).

Changing the call to iconv("UTF-8", "GB2312//IGNORE", $string) makes it skip the characters that cannot be converted.
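The effect of the suffix can be observed by converting the same text with and without //IGNORE. This is only a sketch: the helper name utf8_to_gb2312_lossy is hypothetical, PHP 7 or later is assumed, and the exact behavior depends on the underlying iconv library. Some builds have been reported to return false even with //IGNORE, so the return value is still checked.

```php
<?php
/**
 * Hypothetical helper: convert UTF-8 text to GB2312 and ask iconv to skip
 * characters GB2312 cannot represent. Returns the converted string or false.
 */
function utf8_to_gb2312_lossy(string $utf8)
{
    // @ hides the notice iconv raises when it meets an unconvertible character.
    return @iconv('UTF-8', 'GB2312//IGNORE', $utf8);
}

$plain = "\u{4F60}\u{597D}";       // "你好", fully representable in GB2312
$mixed = $plain . "\u{1F600}";     // the same text plus an emoji GB2312 lacks

$strict = @iconv('UTF-8', 'GB2312', $mixed);   // no suffix
$lossy  = utf8_to_gb2312_lossy($mixed);        // with //IGNORE

var_dump($strict);   // failure or truncated output, depending on the PHP version
var_dump($lossy === false ? false : bin2hex($lossy));
```

If the //IGNORE call still fails on a given system, falling back to mb_convert_encoding is one option.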

Additional:

Introduction of mb_convert_encoding and iconv Functions in PHP

The mb_convert_encoding function also converts between character encodings and is similar to iconv.

English text generally does not run into encoding problems; only Chinese data does. For example, suppose you write a program in Zend Studio or Editplus using GBK encoding. If the data has to be stored in a database whose encoding is UTF-8, it must be converted first, otherwise it will be garbled once it is stored.

For the usage of mb_convert_encoding, see the official manual:

http://cn.php.net/manual/zh/function.mb-convert-encoding.php

An example converting GBK to UTF-8:


<?php
header("Content-Type: text/html; charset=utf-8"); // Set the page character encoding to UTF-8
echo mb_convert_encoding(" Mine ", "UTF-8", "GBK");
?>

Another example, converting GB2312 to Big5:


<?php
header("content-Type: text/html; charset=big5");
echo mb_convert_encoding(" Friend ", "big5", "GB2312");
?>

However, the functions above require the mbstring extension to be enabled first. As before, this is done by editing php.ini.
Another PHP function, iconv, also converts string encodings and works much like the function above.

Here are some further details:


iconv - Convert string to requested character encoding
(PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding - Convert character encoding
(PHP 4 >= 4.0.6, PHP 5)

Usage:

string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )

The mbstring extension must be enabled first: in php.ini, remove the ";" in front of extension=php_mbstring.dll.
mb_convert_encoding accepts several candidate input encodings and picks the right one automatically from the content, but it is much slower than iconv. So why cover the mb_convert_encoding() function at all? The answer is as follows:

iconv was found to fail when converting the character "-" to GB2312. Without the //IGNORE suffix, everything after that character is lost. Either way, the "-" itself cannot be converted and is missing from the output. mb_convert_encoding does not have this bug.

In general, use iconv, and fall back to mb_convert_encoding only when the source encoding cannot be determined or when the text does not display correctly after an iconv conversion.
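One way to put this rule of thumb into code is a small wrapper that tries iconv first and falls back to mb_convert_encoding. This is a sketch under assumptions, not a standard API: the function name to_utf8, its parameters, and the default candidate list are all hypothetical, and PHP 7.1 or later is assumed for the nullable type.

```php
<?php
/**
 * Hypothetical helper: convert $text to UTF-8.
 * Uses iconv when the source encoding $from is known, and falls back to
 * mb_convert_encoding when it is unknown or when iconv fails.
 * $candidates is an assumed list of encodings the input might be in.
 * Returns the converted string, or false if no converter is available.
 */
function to_utf8(string $text, ?string $from = null, array $candidates = ['UTF-8', 'GBK'])
{
    if ($from !== null && function_exists('iconv')) {
        $out = @iconv($from, 'UTF-8', $text);
        if ($out !== false) {
            return $out;
        }
    }

    if (function_exists('mb_convert_encoding')) {
        // Given a list of encodings, mbstring detects the source encoding itself.
        return mb_convert_encoding($text, 'UTF-8', $from ?? $candidates);
    }

    return false;
}

$gbkBytes = "\xC4\xE3\xBA\xC3";               // "你好" encoded as GBK
var_dump(bin2hex((string) to_utf8($gbkBytes, 'GBK')));
```

Calling to_utf8($gbkBytes) without the second argument leaves the detection to mbstring, which is slower and can guess wrong on short or ambiguous input, so passing the encoding explicitly is preferable whenever it is known.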

