C# methods for getting the number of bytes in a string

  • 2020-06-03 08:08:53
  • OfStack

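Convert the string to an ASCII-encoded byte array. ASCII cannot represent Chinese characters, so each one is encoded as 63, the code for "?". Counting every 63 as two bytes and every other value as one byte gives the byte length of a mixed Chinese-English string: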

using System;
using System.Text;

class StringOP
{
    /// <summary>
    /// Gets the actual length (number of bytes) of a mixed Chinese-English string.
    /// </summary>
    /// <param name="str">The string to measure</param>
    /// <returns>The actual length of the string (number of bytes)</returns>
    public int getStringLength(string str)
    {
        if (string.IsNullOrEmpty(str))
            return 0;
        int strlen = 0;
        ASCIIEncoding strData = new ASCIIEncoding();
        // Convert the string to an ASCII-encoded byte array
        byte[] strBytes = strData.GetBytes(str);
        for (int i = 0; i < strBytes.Length; i++)
        {
            // A Chinese character is encoded as ASCII 63, i.e. "?", so count it twice.
            // Note: a literal "?" in the input is also counted as two bytes.
            if (strBytes[i] == 63)
                strlen++;
            strlen++;
        }
        return strlen;
    }
}

class TestMain
{
    static void Main()
    {
        StringOP sop = new StringOP();
        string str = "我爱中国!I Love Beijing!";
        int iLen = sop.getStringLength(str);
        Console.WriteLine("The number of bytes in \"" + str + "\" is: " + iLen);
        Console.ReadKey();
    }
}
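The ASCII-based check cannot tell a real question mark from a replaced character, so a literal "?" in the input is counted as two bytes. The following is a minimal sketch of one way around that, assuming the same convention (every non-ASCII character counts as two bytes). The class and method names are hypothetical and not part of the original code.

```csharp
using System;

static class ByteLengthSketch
{
    // Counts 1 for each ASCII character (code <= 127) and 2 for every other
    // UTF-16 code unit, without going through ASCIIEncoding.
    // Assumption: every non-ASCII character occupies two bytes.
    public static int CountAssumingDoubleByte(string str)
    {
        if (string.IsNullOrEmpty(str))
            return 0;

        int count = 0;
        foreach (char c in str)
        {
            count += c <= 127 ? 1 : 2;
        }
        return count;
    }
}
```

For the input "中国?", `ByteLengthSketch.CountAssumingDoubleByte` returns 5 (2 + 2 + 1), whereas the ASCII-based `getStringLength` returns 6 because the trailing "?" is counted twice.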

Alternatively, convert the string to a byte array using Unicode (UTF-16) encoding, which stores every character in two bytes, and check whether the second byte of each character is greater than 0. If it is, the character is counted as two bytes; otherwise it is counted as one:

public static int bytelenght(string str)
{
    // Encode the string as Unicode (UTF-16); every character, English or Chinese, is stored in 2 bytes
    byte[] bytestr = System.Text.Encoding.Unicode.GetBytes(str);
    int j = 0;
    for (int i = 0; i < bytestr.Length; i++)
    {
        // Even indices hold the first byte of each Unicode character: always count it
        if (i % 2 == 0)
        {
            j++;
        }
        else
        {
            // Odd indices hold the second byte of a character. If it is 0, the character
            // is an English character; otherwise it is a Chinese character, so count it too
            if (bytestr[i] > 0)
            {
                j++;
            }
        }
    }
    return j;
}

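The simplest approach is to convert the string directly to a byte array and take its length. The result depends on the encoding behind Encoding.Default (the system ANSI code page on .NET Framework, always UTF-8 on .NET Core and later):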

byte[] sarr = System.Text.Encoding.Default.GetBytes(s);
int len = sarr.Length;
