Details of file.encoding Settings in Java

  • 2020-06-23 00:20:22
  • OfStack

Yesterday someone mentioned that setting the file.encoding property on System in order to change the default charset has no effect:


Properties pps=System.getProperties(); 
pps.setProperty("file.encoding","ISO-8859-1"); 

In Java, if no charset is specified, for example in new String(byte[] bytes), the method Charset.defaultCharset() is called:


public static Charset defaultCharset() {
    if (defaultCharset == null) {
        synchronized (Charset.class) {
            java.security.PrivilegedAction pa =
                new GetPropertyAction("file.encoding");
            String csn = (String) AccessController.doPrivileged(pa);
            Charset cs = lookup(csn);
            if (cs != null)
                defaultCharset = cs;
            else
                defaultCharset = forName("UTF-8");
        }
    }
    return defaultCharset;
}

We can clearly see that defaultCharset is meant to be initialized only once. There is still a small problem here: the null check happens outside the synchronized block and is not repeated inside it, so under concurrent calls from multiple threads the field can still be initialized several times. The repeated initializations read from the cache (the lookup function), though, so the problem is not serious.
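For comparison, here is a minimal sketch of an initialization that runs exactly once under concurrency, using a volatile field and a second null check inside the lock. The class DefaultCharsetHolder, its resolve helper, and the initCount counter are hypothetical names made up for this illustration; this is not the JDK's implementation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder class, for illustration only (not JDK code).
class DefaultCharsetHolder {
    // volatile so that a fully initialized value is visible to all threads
    private static volatile Charset cached;

    // Counts how often the initialization branch actually runs (demo only)
    static final AtomicInteger initCount = new AtomicInteger();

    static Charset get() {
        Charset cs = cached;
        if (cs == null) {
            synchronized (DefaultCharsetHolder.class) {
                cs = cached;            // second check, now under the lock
                if (cs == null) {
                    initCount.incrementAndGet();
                    cs = resolve(System.getProperty("file.encoding"));
                    cached = cs;
                }
            }
        }
        return cs;
    }

    // Falls back to UTF-8 when the name is missing, illegal, or unsupported
    private static Charset resolve(String name) {
        try {
            if (name != null && Charset.isSupported(name)) {
                return Charset.forName(name);
            }
        } catch (IllegalArgumentException e) {
            // illegal charset name: fall through to the default
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(get() + ", initialized " + initCount.get() + " time(s)");
    }
}
```

The second null check is what the quoted method lacks; without it, every thread that saw null before the first writer finished would run the initialization again.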

When we change file.encoding through System.getProperties(), defaultCharset has already been initialized, so the initialization code above is not run again.
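The effect described above can be checked with a small program. This is only a sketch: the class name FileEncodingDemo and its helper method are made up for the illustration, and the actual default charset printed depends on your platform and JDK.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: shows that changing file.encoding at runtime
// does not change the already-cached default charset.
class FileEncodingDemo {

    static boolean defaultCharsetUnchangedAfterSet(String newEncoding) {
        Charset before = Charset.defaultCharset();   // forces initialization
        String old = System.getProperty("file.encoding");
        System.setProperty("file.encoding", newEncoding);
        try {
            // The cached value is returned; the property is not re-read
            return before.equals(Charset.defaultCharset());
        } finally {
            // Restore the original property value so the demo has no side effects
            if (old != null) {
                System.setProperty("file.encoding", old);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("unchanged after setProperty: "
                + defaultCharsetUnchangedAfterSet("ISO-8859-1"));
    }
}
```

The second line should print true: the property value itself changes, but Charset.defaultCharset() keeps returning the charset chosen at startup.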

When the JVM starts and loads its classes, defaultCharset is initialized before the main function is finally called. Many methods and classes, such as String.getBytes, InputStreamReader and OutputStreamWriter, call Charset.defaultCharset(), so there is little point in tracing which one called it first.

For defaultCharset, the encoding used by the JVM is the one in effect at startup and cannot be changed afterwards. The only way to change the JVM's default charset is to add -Dfile.encoding=UTF-8 to the startup parameters.
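When the startup parameters are not under your control, one common alternative is to stop relying on the default and pass the charset explicitly. The following sketch assumes UTF-8 is the encoding you want; the class ExplicitCharsetDemo and its two helpers are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: reads and writes text with an explicit charset,
// so the result does not depend on file.encoding at all.
class ExplicitCharsetDemo {

    static byte[] encode(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static String decode(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";                 // "café", written with an escape
        byte[] bytes = encode(text);
        System.out.println(bytes.length);                 // 5: the accented letter takes 2 bytes
        System.out.println(decode(bytes).equals(text));   // true
    }
}
```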

Digression

In Java, a String is represented as an array of char, and Java's char is different from C's char: Java's char is two bytes wide (a 16-bit UTF-16 code unit), while C's char is a single byte, the same size as Java's byte.
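The size difference is easy to verify from Java itself. A small sketch follows; CharSizeDemo and utf8Length are names invented for the illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares the width of char and byte and shows
// that one char can map to a different number of bytes per charset.
class CharSizeDemo {

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(Character.BYTES);   // 2
        System.out.println(Byte.BYTES);        // 1

        String s = "\u4e2d";                   // the single character U+4E2D
        System.out.println(s.length());        // 1 char
        System.out.println(utf8Length(s));     // 3 bytes in UTF-8
        System.out.println(s.getBytes(StandardCharsets.UTF_16BE).length); // 2 bytes
    }
}
```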

In other words, when we convert bytes to a String, the bytes are decoded into chars according to the charset; and when we call println or write on a String, the chars still have to be encoded back into bytes before they are output to the console or a file.
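As a rough illustration of the two directions, the sketch below decodes the same two bytes with two different charsets and then encodes the result again. DecodeEncodeDemo and its helpers are hypothetical names; the byte values are the UTF-8 encoding of the letter U+00E9.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: the same bytes give different strings depending
// on the charset used to decode them.
class DecodeEncodeDemo {

    static String decode(byte[] bytes, Charset cs) {
        return new String(bytes, cs);      // bytes -> chars
    }

    static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);             // chars -> bytes
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9};   // U+00E9 encoded in UTF-8

        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(right.length());   // 1: decoded as one character
        System.out.println(wrong.length());   // 2: each byte became its own char

        // Encoding the correct string in ISO-8859-1 needs only one byte (0xE9)
        System.out.println(encode(right, StandardCharsets.ISO_8859_1).length); // 1
    }
}
```

This is why a wrong default charset shows up as garbled text: the bytes are intact, but they are decoded or encoded with a charset other than the one that produced them.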

When the C function write is finally called, a Java byte array is copied into a C char array:


(*env)->GetByteArrayRegion(env, bytes, off, len, (jbyte *)buf); 

Thank you for reading. I hope this helps, and thank you for your support!

