A brief discussion on character encoding conversion in JavaScript

  • 2020-07-21 06:51:26
  • OfStack

To get the Unicode encoding of a character, use the string's charCodeAt(index) method, which is defined as:


  strObj.charCodeAt(index)


index is the zero-based position of the character in strObj. The method returns the UTF-16 code unit at that position, a 16-bit integer between 0 and 65535. For example:


   var strObj = "ABCDEFG";


   var code = strObj.charCodeAt(2); // Unicode value of character 'C' is 67


If there is no character at the position given by index, the return value is NaN.
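
Because NaN never compares equal to anything, including itself, an out-of-range index is easy to miss. The sketch below is one way to guard against it. The helper name codeUnitAt is hypothetical and not part of the original article, and Number.isNaN assumes an ES2015 or later environment.

```javascript
// Hypothetical helper (an assumption for illustration): returns the code
// unit at `index`, or null when charCodeAt reports NaN for a missing position.
function codeUnitAt(str, index) {
  var code = str.charCodeAt(index);
  return Number.isNaN(code) ? null : code;
}

var strObj = "ABCDEFG";
var inRange = codeUnitAt(strObj, 2);     // code for 'C'
var outOfRange = codeUnitAt(strObj, 10); // no character at index 10
```

Returning null is only one possible convention; throwing an error or returning a default value would work equally well, depending on the caller.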

To convert a Unicode encoding back into a character, use the String.fromCharCode() method. Note that it is a "static method" of the String object, meaning you call it on String itself and don't need to create a string instance first:



  String.fromCharCode(c1, c2, ...)


It accepts zero or more integers and returns a string made up of the characters those integers specify, in argument order. For example:



var str = String.fromCharCode(72, 101, 108, 108, 111); // str == "Hello"
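
The two methods are inverses of each other, which a short round trip can illustrate. This is a minimal sketch. The helper names toCodes and fromCodes are assumptions made for the example and do not come from the original article.

```javascript
// Hypothetical helper: collect the code unit of every position in a string.
function toCodes(str) {
  var codes = [];
  for (var i = 0; i < str.length; i++) {
    codes.push(str.charCodeAt(i));
  }
  return codes;
}

// Hypothetical helper: rebuild a string from an array of code units.
// apply() spreads the array into fromCharCode's argument list.
function fromCodes(codes) {
  return String.fromCharCode.apply(null, codes);
}

var codes = toCodes("Hello");     // [72, 101, 108, 108, 111]
var roundTrip = fromCodes(codes); // "Hello"
```

Passing a very large array through apply() can exceed an engine's argument limit, so for long strings it may be safer to convert in chunks.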


Discussion:


Unicode covers the characters of many of the world's written languages, but the fact that Unicode defines a character does not guarantee that it will display properly in an alert dialog, a text box, or a rendered page. If no installed font contains the character, it appears as a question mark or some other placeholder symbol. For example, a typical North American computer cannot display Chinese characters on screen unless Chinese language support and its fonts are installed.

