How many bytes does a char occupy in Java: an example analysis

2020-06-15 09:09:22
OfStack

1: A "byte" is a unit of 8 bits; a "bit" is a single binary digit.

2: 1 byte = 8 bits.

A char is 2 bytes in Java. Java uses Unicode and stores each char in 2 bytes (16 bits), which is one UTF-16 code unit.
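These sizes can be checked against constants in the standard library. The following is only a small sketch; the class name CharSizeDemo and its helper methods are made up for illustration. It also shows a caveat: a character outside the Basic Multilingual Plane (U+1F600 is used here as an example) is stored as two char values, so "1 character = 2 bytes" holds only for characters that fit in a single 16-bit unit.

```java
class CharSizeDemo {

  // Number of char values (UTF-16 code units) used to store the string.
  static int charCount(String s) {
    return s.length();
  }

  // Number of Unicode code points in the string.
  static int codePointCount(String s) {
    return s.codePointCount(0, s.length());
  }

  public static void main(String[] args) {
    System.out.println("bits per byte:  " + Byte.SIZE);       // 8
    System.out.println("bits per char:  " + Character.SIZE);  // 16
    System.out.println("bytes per char: " + Character.BYTES); // 2

    String bmp = "\u4e2d";                 // "中", fits in one char
    String supplementary = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair
    System.out.println("chars in bmp string: " + charCount(bmp));
    System.out.println("chars in supplementary string: " + charCount(supplementary));
    System.out.println("code points in supplementary string: " + codePointCount(supplementary));
  }
}
```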

Example code is as follows:


import java.nio.charset.StandardCharsets;

public class Test {

  public static void main(String[] args) {
    String str = "中"; // U+4E2D, a single Chinese character
    char x = '中';
    // Encode the string as UTF-8; StandardCharsets avoids the checked
    // UnsupportedEncodingException thrown by getBytes("utf-8").
    byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
    // Split the 16-bit char into its two bytes.
    byte[] bytes1 = charToByte(x);
    System.out.println("bytes size: " + bytes.length);
    System.out.println("bytes1 size: " + bytes1.length);
  }

  // Returns the two bytes of a char, high byte first.
  public static byte[] charToByte(char c) {
    byte[] b = new byte[2];
    b[0] = (byte) ((c & 0xFF00) >> 8);
    b[1] = (byte) (c & 0xFF);
    return b;
  }
}

Output:

bytes size: 3
bytes1 size: 2

Java represents characters in Unicode, and the in-memory (UTF-16) representation of "中" (zhong) is 2 bytes.
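To see what those 2 bytes are, the sketch below splits the char by hand, in the same way as charToByte in the example, and compares the result with encoding the string as UTF-16BE. The class name CharBytesDemo and the toHex helper are illustrative assumptions, not part of the original example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharBytesDemo {

  // Same split as charToByte in the example: high byte first.
  static byte[] charToByte(char c) {
    return new byte[] { (byte) (c >> 8), (byte) c };
  }

  // Formats a two-byte array as hex, e.g. "4E 2D".
  static String toHex(byte[] b) {
    return String.format("%02X %02X", b[0], b[1]);
  }

  public static void main(String[] args) {
    char zhong = '\u4e2d'; // "中"
    byte[] manual = charToByte(zhong);
    byte[] encoded = String.valueOf(zhong).getBytes(StandardCharsets.UTF_16BE);
    System.out.println("manual split: " + toHex(manual));
    System.out.println("UTF-16BE:     " + toHex(encoded));
    System.out.println("identical:    " + Arrays.equals(manual, encoded));
  }
}
```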

The String.getBytes(encoding) method returns the string's byte array representation in the specified encoding.

For a Chinese character such as "中", gbk/gb2312 usually takes 2 bytes and utf-8 takes 3 bytes.
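The byte counts per encoding can be compared directly. This is a rough sketch with a made-up class name (EncodingLengthDemo) and helper (encodedLength). It assumes the runtime may or may not ship the GBK charset, so that case is guarded with Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingLengthDemo {

  // Length in bytes of s when encoded with the given charset.
  static int encodedLength(String s, Charset charset) {
    return s.getBytes(charset).length;
  }

  public static void main(String[] args) {
    String zhong = "\u4e2d"; // "中"
    System.out.println("UTF-8:    " + encodedLength(zhong, StandardCharsets.UTF_8));
    System.out.println("UTF-16BE: " + encodedLength(zhong, StandardCharsets.UTF_16BE));
    // GBK is not one of the charsets every Java runtime must provide,
    // so check for it before use.
    if (Charset.isSupported("GBK")) {
      System.out.println("GBK:      " + encodedLength(zhong, Charset.forName("GBK")));
    }
    // An ASCII letter takes a single byte in UTF-8, so the 3-byte figure
    // applies to characters like "中", not to every character.
    System.out.println("UTF-8 'a': " + encodedLength("a", StandardCharsets.UTF_8));
  }
}
```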

If no encoding is specified, the platform's default encoding is used.
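The default can be inspected with Charset.defaultCharset(). The sketch below uses an illustrative class name (DefaultCharsetDemo) and helper. Its printed charset depends on the JDK version and system settings; on JDK 18 and later it is UTF-8 unless configured otherwise, so code that must be portable is usually better off naming the encoding explicitly.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class DefaultCharsetDemo {

  // True if the no-argument getBytes() gives the same bytes as
  // encoding explicitly with the default charset.
  static boolean usesDefaultCharset(String s) {
    return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
  }

  public static void main(String[] args) {
    String zhong = "\u4e2d"; // "中"
    System.out.println("default charset: " + Charset.defaultCharset());
    System.out.println("getBytes() length: " + zhong.getBytes().length);
    System.out.println("matches default charset: " + usesDefaultCharset(zhong));
  }
}
```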

Thank you for reading. I hope this helps, and thank you for your support of this site!