Encoding functions in JS: escape, encodeURI and encodeURIComponent

  • 2021-08-05 08:23:04
  • OfStack

1. escape(): This method does not encode ASCII letters and digits, nor the following ASCII punctuation marks: * @ - _ + . /. All other characters are replaced by escape sequences. For ASCII characters that all three functions encode (a space, for example), escape, encodeURI and encodeURIComponent produce the same result.

escape outputs the %uXXXX format when encoding Unicode values outside the range 0-255.

Strings encoded with escape() can be decoded using unescape().

ECMAScript v3 deprecates this method; use encodeURI() and encodeURIComponent() instead (and decodeURI() and decodeURIComponent() in place of unescape()).
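As a rough illustration of the behaviour described above, the following sketch compares escape with encodeURIComponent. The sample strings are assumptions chosen for this example, and the legacy escape and unescape functions appear only to show their output format.

```javascript
// Sample inputs (assumed for illustration only).
const ascii = "a b@c";
const chinese = "中";

// escape leaves @ untouched and writes the space as %20.
const escapedAscii = escape(ascii); // "a%20b@c"

// Values above 255 are written in the %uXXXX form.
const escapedChinese = escape(chinese); // "%u4E2D"

// encodeURIComponent writes the UTF-8 bytes of the same character instead.
const encodedChinese = encodeURIComponent(chinese); // "%E4%B8%AD"

// unescape reverses escape.
const roundTrip = unescape(escapedChinese); // "中"

console.log(escapedAscii, escapedChinese, encodedChinese, roundTrip);
```

The two outputs for the same character differ, which is one reason a server expecting UTF-8 percent-encoding may not understand a value produced by escape.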

2. encodeURI and encodeURIComponent

Both encodeURI and encodeURIComponent are functions defined in the ECMA-262 standard and are implemented in all languages compatible with this standard (e.g. JavaScript, ActionScript). They are global functions used to encode URI (RFC-2396) strings, but they treat characters differently and are used in different scenarios. To explain their differences, we first need to understand how RFC-2396 classifies the characters in a URI:

1 > Reserved characters: These characters are reserved in a URI and are used to delimit its various parts. These characters are: ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","

2 > Mark characters: This type of character is specifically defined in RFC-2396, but its purpose is not specified and may be related to other RFC standards. These characters are: "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

3 > Basic characters (alphanum characters): This type of character makes up the main part of a URI and includes all uppercase letters, lowercase letters, and digits.

With these three types of characters introduced, it is easy to explain the differences between the encodeURI and encodeURIComponent functions:

encodeURI: This function escapes every character in the incoming string that is not a basic, mark, or reserved character (it also leaves # unescaped). Each character to be escaped is converted, according to its UTF-8 encoding, into hexadecimal escape sequences (%xx) of 1, 2, or 3 bytes. For example, the space character " " is converted to "%20". Under this encoding, ASCII characters that need encoding are replaced by a 1-byte escape sequence, characters between \u0080 and \u07ff are replaced by 2-byte escape sequences, and the other 16-bit Unicode characters are replaced by 3-byte escape sequences. Characters outside the 16-bit range, which JavaScript stores as surrogate pairs, take 4 bytes.
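The byte counts above can be checked with a small sketch. The helper name countEscapedBytes is hypothetical, and the sample characters are assumptions picked to fall into each range; the last one lies outside the 16-bit range and takes 4 bytes in UTF-8.

```javascript
// Hypothetical helper: counts the %XX groups that encodeURI produces for a string.
function countEscapedBytes(text) {
  const matches = encodeURI(text).match(/%[0-9A-F]{2}/g);
  return matches ? matches.length : 0;
}

// ASCII character that needs encoding: 1 byte.
console.log(encodeURI(" "), countEscapedBytes(" ")); // %20 1

// Character between \u0080 and \u07ff: 2 bytes.
console.log(encodeURI("\u00e9"), countEscapedBytes("\u00e9")); // %C3%A9 2

// Other 16-bit character: 3 bytes.
console.log(encodeURI("\u4e2d"), countEscapedBytes("\u4e2d")); // %E4%B8%AD 3

// Character outside the 16-bit range (a surrogate pair in JavaScript): 4 bytes.
console.log(encodeURI("\u{1F600}"), countEscapedBytes("\u{1F600}")); // %F0%9F%98%80 4
```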

encodeURIComponent: This function differs from encodeURI in only one way: it also escapes the reserved characters. As a result, parameter names and values in a URL are not truncated by special characters such as #. For example, given the URL http://localhost:8080/xss/XssServlet?username=A&T Plastic and the back-end code:


String username = request.getParameter("username");

The username value obtained is A, not the A&T Plastic we expected. Because username=A&T Plastic contains the reserved character & and is not encoded, the value of username is truncated at that character. So the right thing to do is to encode it: encodeURIComponent("A&T Plastic") returns A%26T%20Plastic, so change the link above to:

http://localhost:8080/xss/XssServlet?username=A%26T%20Plastic so that the back end gets the correct value: username == A&T Plastic.

Because the value of username contains URI reserved characters, encoding is required.

For example, the character ':' is replaced by the escape sequence '%3A'.
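The username example can be sketched in JavaScript as follows. The helper buildQuery is hypothetical, and the parsing at the end only approximates what a server does when it splits a query string; it is not the servlet's actual implementation.

```javascript
// Base URL taken from the example above.
const base = "http://localhost:8080/xss/XssServlet";

// Hypothetical helper: encodes every name and value before joining them.
function buildQuery(params) {
  return Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
}

// Without encoding, splitting on & cuts the value short.
const naiveQuery = "username=A&T Plastic";
const truncated = naiveQuery.split("&")[0].split("=")[1];
console.log(truncated); // A

// With encoding, the & inside the value survives as %26.
const url = base + "?" + buildQuery({ username: "A&T Plastic" });
console.log(url); // http://localhost:8080/xss/XssServlet?username=A%26T%20Plastic

// Reading the value back, roughly as a server would.
const rawValue = url.split("?")[1].split("=")[1];
const recovered = decodeURIComponent(rawValue);
console.log(recovered); // A&T Plastic
```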

These two functions exist because we have two different encoding requirements for URIs when writing JS code. encodeURI can be used to encode a complete URI string. encodeURIComponent encodes one part of a URI, so that this part can contain URI reserved characters. This is very useful in everyday programming. For example, consider the following URI string:

http://www.mysite.com/send-to-friend.aspx?url=http://www.mysite.com/product.html

In this URI string, the send-to-friend.aspx page creates message content in HTML format that contains a link whose address is the url value in the URI string above. Clearly, that url value is one part of the URI and contains URI reserved characters. We must call encodeURIComponent to encode it; otherwise the URI string above may be treated as invalid or parsed incorrectly.

The correct URI should be as follows:

http://www.mysite.com/send-to-friend.aspx?url=http%3A%2F%2Fwww.mysite.com%2Fproduct.html
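A minimal sketch of the difference for this case, using the two addresses from the example. The variable names are assumptions made for illustration.

```javascript
// Addresses taken from the example above.
const page = "http://www.mysite.com/send-to-friend.aspx";
const target = "http://www.mysite.com/product.html";

// encodeURI keeps reserved characters such as : and /, so the nested URL is unchanged.
const withEncodeURI = page + "?url=" + encodeURI(target);
console.log(withEncodeURI);
// http://www.mysite.com/send-to-friend.aspx?url=http://www.mysite.com/product.html

// encodeURIComponent escapes them, so the nested URL becomes a single safe value.
const withComponent = page + "?url=" + encodeURIComponent(target);
console.log(withComponent);
// http://www.mysite.com/send-to-friend.aspx?url=http%3A%2F%2Fwww.mysite.com%2Fproduct.html

// The receiving side can restore the original address with decodeURIComponent.
const restored = decodeURIComponent(withComponent.split("?url=")[1]);
console.log(restored === target); // true
```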

The function you will use most often is encodeURIComponent. It converts Chinese, Korean and other special characters into UTF-8 URL encoding, so if you use encodeURIComponent to pass parameters to the back end, the back end must decode them as UTF-8 (a form's encoding is the same as the encoding of the current page).

escape has 69 unencoded characters: * + - . / @ _ 0-9 a-z A-Z

encodeURI has 82 unencoded characters: ! # $ & ' ( ) * + , - . / : ; = ? @ _ ~ 0-9 a-z A-Z

encodeURIComponent has 71 unencoded characters: ! ' ( ) * - . _ ~ 0-9 a-z A-Z
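These counts can be reproduced by testing every ASCII character against each function. The helper countUnencoded below is a hypothetical name for this sketch; it counts the characters that a function returns unchanged.

```javascript
// Hypothetical helper: counts ASCII characters (codes 0-127) that fn leaves unencoded.
function countUnencoded(fn) {
  let count = 0;
  for (let code = 0; code < 128; code++) {
    const ch = String.fromCharCode(code);
    if (fn(ch) === ch) {
      count++;
    }
  }
  return count;
}

console.log(countUnencoded(escape));             // 69
console.log(countUnencoded(encodeURI));          // 82
console.log(countUnencoded(encodeURIComponent)); // 71
```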

Examples:


alert(encodeURIComponent("A&T Plastic")); //A%26T%20Plastic
alert(escape("A&T Plastic"));  //A%26T%20Plastic
alert(encodeURI("A&T Plastic"));  //A&T%20Plastic
alert(escape("A&T Plastic Medium "));  //A%26T%20Plastic%uFFFD%uFFFD

We see that encodeURI does not encode the reserved characters of uri & , 'Medium' is encoded as% uFFFD% uFFFD

encodeURIComponent encodes reserved characters & .

URL encoding is often used in XSS attacks to bypass server-side XSS filters, disguise a malicious URL, and trick unsuspecting users into clicking it.

Therefore, if you only need to fix garbled characters in the URL when submitting with GET, you can use encodeURI to encode the whole URL;

If parameter values contain reserved characters that need to be encoded, use encodeURIComponent to encode those individual parameters;

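If you use encodeURIComponent to handle garbled Chinese characters, the front end needs to apply encodeURIComponent twice, as in encodeURIComponent(encodeURIComponent('你好')). The container typically decodes the request parameter once on its own, which leaves one layer of plain ASCII for the code to decode as UTF-8, so the Java back end uses: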


java.net.URLDecoder.decode(param, "UTF-8");

to decode.
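The double-encoding round trip can be sketched in JavaScript alone. The sample value '你好' and the assumption that the server performs one automatic decode before the application code runs are illustrative; actual behaviour depends on the container and its configuration.

```javascript
// Sample value (assumed for illustration).
const original = "你好";

// First pass: the UTF-8 bytes become %XX sequences.
const once = encodeURIComponent(original);
console.log(once); // %E4%BD%A0%E5%A5%BD

// Second pass: every % from the first pass becomes %25.
const twice = encodeURIComponent(once);
console.log(twice); // %25E4%25BD%25A0%25E5%25A5%25BD

// Assumption: the server decodes once automatically. The result is pure ASCII,
// so it survives even if the server uses the wrong character set for this step.
const afterAutomaticDecode = decodeURIComponent(twice);
console.log(afterAutomaticDecode === once); // true

// The application then decodes explicitly as UTF-8, which is the role
// URLDecoder.decode(param, "UTF-8") plays on the Java side.
const recoveredValue = decodeURIComponent(afterAutomaticDecode);
console.log(recoveredValue === original); // true
```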

Reference:

https://www.ofstack.com/article/22880.htm
