Fixing garbled Chinese characters when encoding with escape and decoding with unescape

  • 2020-03-30 03:31:06
  • OfStack

In today's project, we needed to encode Chinese characters with JavaScript's escape and then decode them with unescape. While testing the code, we ran into garbled characters.
The details are as follows.
First, open the test page test.html in EditPlus and enter the following HTML code:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>unescape test </title>
</head>
<body>
<script>
var teststr=escape("脚本之家");
document.write(teststr);
</script>
</body>
</html>

Page printout:


%uFFFD%u0171%uFFFD%u05AE%uFFFD%uFFFD

At this point something is clearly wrong: the character count alone does not match. Four Chinese characters should produce four %uXXXX sequences, but the output contains six.
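The mismatch in length can be checked mechanically. The sketch below counts the %uXXXX sequences in each string; countEscapedUnits is a hypothetical helper introduced only for illustration, and the second string is the output you would expect for the four characters 脚本之家.

```javascript
// Count the %uXXXX sequences in a string produced by escape().
// countEscapedUnits is a hypothetical helper, not part of the test page.
function countEscapedUnits(escaped) {
  var matches = escaped.match(/%u[0-9A-F]{4}/g);
  return matches ? matches.length : 0;
}

var garbledOutput = "%uFFFD%u0171%uFFFD%u05AE%uFFFD%uFFFD";
var expectedOutput = "%u811A%u672C%u4E4B%u5BB6";

console.log(countEscapedUnits(garbledOutput));  // 6
console.log(countEscapedUnits(expectedOutput)); // 4
```

A count that differs from the number of characters typed, together with repeated %uFFFD (the Unicode replacement character), is a strong hint that the source file was decoded with the wrong character set.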
The following code tests what unescape returns for this string:


var relstr=unescape("%uFFFD%u0171%uFFFD%u05AE%uFFFD%uFFFD");
document.write(relstr);

Garbled text appears: �ű�֮��

Solution:
Opening test.html in Dreamweaver reveals the problem.
The original line


var teststr=escape("脚本之家");

had turned into


var teststr=escape("�ű�֮��");

This is caused by the encoding the editor used when it first saved the file, which did not match the utf-8 charset the page declares.
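The effect can be reproduced outside the browser. The sketch below assumes the editor saved the file as GBK, which the article does not state; the byte values are the GBK encoding of 脚本之家 under that assumption. Decoding those bytes as UTF-8, as the page's charset declaration requests, turns every invalid byte into U+FFFD. It relies on the standard TextDecoder API, available in current browsers and in Node.js.

```javascript
// Assumption: test.html was saved as GBK. These are the GBK bytes of
// the four characters (two bytes per character).
var gbkBytes = new Uint8Array([0xBD, 0xC5, 0xB1, 0xBE, 0xD6, 0xAE, 0xBC, 0xD2]);

// The page declares charset=utf-8, so the bytes are decoded as UTF-8.
// Invalid bytes become U+FFFD; the byte pairs C5 B1 and D6 AE happen to be
// valid two-byte UTF-8 sequences and decode to U+0171 and U+05AE.
var misread = new TextDecoder("utf-8").decode(gbkBytes);

console.log(misread.length);  // 6
console.log(escape(misread)); // %uFFFD%u0171%uFFFD%u05AE%uFFFD%uFFFD
```

The result matches the garbled output of the test page, which is consistent with the file's bytes and the declared charset disagreeing.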
In Dreamweaver, restore the Chinese characters, rerun test.html, and the correct encoding appears:


%u811A%u672C%u4E4B%u5BB6


Decoding this with unescape now prints the original characters, 脚本之家:


var relstr=unescape("%u811A%u672C%u4E4B%u5BB6");
document.write(relstr);
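As a quick sanity check, the round trip can be verified in code. This is a minimal sketch rather than part of the original test page: it writes the string with \u escapes so that the result does not depend on how the editor saved the file, and the variable names are chosen only for illustration.

```javascript
// Written with \u escapes so the source file's encoding cannot alter it.
var original = "\u811A\u672C\u4E4B\u5BB6"; // 脚本之家

var encoded = escape(original);
var decoded = unescape(encoded);

console.log(encoded);              // %u811A%u672C%u4E4B%u5BB6
console.log(decoded === original); // true
```

If the comparison prints false for a string literal typed directly into the file, the file's saved encoding is the first thing to check.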
