Example of using a regular expression to calculate the length of Chinese text in JavaScript

  • 2020-03-30 02:44:04
  • OfStack

JavaScript strings are Unicode, so a Chinese character counts as one toward a string's length, just like an English letter. Backend programs often do not work that way: there a Chinese character commonly occupies two bytes, so a length check done in the front end and the same check done in the back end can disagree. A regular expression solves this problem.


function getRealLen(str) {
    // Replace every character outside \x00-\xff (the double-byte ones) with two placeholders
    return str.replace(/[^\x00-\xff]/g, '__').length;
}

Bonus tip:

Sometimes, to keep the layout and interface tidy, text is truncated in parts of a page. Chinese and English characters differ in width, though: if Chinese text is cut by the English standard, or English text by the Chinese standard, the results come out visibly uneven in length. This is especially easy to run into with things like nicknames, which often mix Chinese and English. The same line of thinking can be applied here:


function beautySub(str, len) {
    var reg = /[\u4e00-\u9fa5]/g,           // matches the common Chinese characters
        slice = str.substring(0, len),
        matched = slice.match(reg),          // null when the slice contains no Chinese
        realen = len - (matched ? matched.length : 0);
    // Always keep at least one character
    return slice.substring(0, realen > 0 ? realen : 1);
}

Here a Chinese character is assumed to be as wide as two English characters. A perfectionist will notice that j, w, and m are not the same width, and that w, m, and the uppercase letters are roughly as wide as a Chinese character. The regular expression and the function therefore have considerable room for improvement, and the function could also accept a starting position for the cut.

