The String constant pool in Java explained in detail

  • 2020-04-01 03:40:49
  • OfStack

There are two ways to create a String object in Java. One is the literal form, such as String str = "droid";. The other is the standard new form of object construction, such as String str = new String("droid");. We use both a lot when writing code, especially literals. However, the two differ in performance and memory footprint. This all stems from the fact that the JVM, in order to reduce the repeated creation of string objects, maintains a special area of memory known as the string constant pool, or string literal pool.

How it works

When code creates a string object in literal form, the JVM first examines the literal. If the string constant pool already holds a reference to a string object with the same content, that reference is returned. Otherwise, a new string object is created, a reference to it is put into the string constant pool, and that reference is returned.

For example

Literal creation form


String str1 = "droid";

The JVM examines this literal; assume here that no object with the content droid exists yet. The JVM cannot find a string object with the content droid through the string constant pool, so it creates the string object, puts a reference to the newly created object into the string constant pool, and returns the reference to the variable str1.

Suppose the next piece of code looks like this


String str2 = "droid";

Once again, the JVM examines the literal. It looks in the string constant pool, finds that a string object with the content "droid" already exists, and returns a reference to that existing object to the variable str2. Note that no new string object is created here.

To verify that str1 and str2 are pointing to the same object, we can use this code


System.out.println(str1 == str2);

The result is true.

Create with new


String str3 = new String("droid");

When we use new to construct a string object, a new string object is always created, regardless of whether the string constant pool already holds a reference to an object with the same content. Let's test this with the following code:


String str3 = new String("droid");
System.out.println(str1 == str3);

The result, as we expected, is false, indicating that the two variables point to different objects.

intern

For a string object created with new, as above, you can use the intern method if you want a reference to it added to the string constant pool.

When intern is called, it first checks whether the string constant pool already holds a reference to a string with the same content. If so, that reference is returned; otherwise a reference to this object is added to the pool and returned.


String str4 = str3.intern();
System.out.println(str4 == str1);

The output is true.
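The three cases above (literal, new, and intern) can be collected into one runnable class. This is only a minimal sketch: the class name StringPoolDemo and the helper sameObject are made up for illustration and are not part of the original code.

```java
public class StringPoolDemo {
    // Hypothetical helper: true only when both variables refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String str1 = "droid";             // literal: the reference comes from the pool
        String str2 = "droid";             // same literal: same pooled reference
        String str3 = new String("droid"); // new: always a distinct object
        String str4 = str3.intern();       // intern: returns the pooled reference

        System.out.println(sameObject(str1, str2)); // true
        System.out.println(sameObject(str1, str3)); // false
        System.out.println(str1.equals(str3));      // true, the content is equal
        System.out.println(sameObject(str4, str1)); // true
    }
}
```

The equals line is a reminder that == compares references only; content comparison should always go through equals.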

Common questions

Preconditions?

The string constant pool relies on the fact that String objects in Java are immutable, so multiple variables can safely share the same object. If String objects were mutable, changing the value through one reference would also affect every other variable sharing that object, which is obviously unacceptable.
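A small sketch of why sharing is safe: every "modifying" operation on a String returns a new object and leaves the shared one untouched. The class name ImmutableShareDemo and the helper append are hypothetical names chosen for this illustration.

```java
public class ImmutableShareDemo {
    // Hypothetical helper: concat never changes `base`; it returns a new String
    // (for a non-empty suffix).
    static String append(String base, String suffix) {
        return base.concat(suffix);
    }

    public static void main(String[] args) {
        String a = "droid";
        String b = "droid";          // shares the pooled object with a
        String c = append(a, "s");   // a new object; the shared one is unchanged

        System.out.println(a == b);  // true
        System.out.println(b);       // droid
        System.out.println(c);       // droids
        System.out.println(c == a);  // false
    }
}
```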

Reference or object

The most common question is whether the string constant pool stores references or objects. The string constant pool holds object references, not objects. In Java, objects are created in heap memory.

Update: since many of the comments I received discussed this question, I ran a simple verification. Verification environment:


22:18:54-androidyue~/Videos$ cat /etc/os-release
NAME=Fedora
VERSION="17 (Beefy Miracle)"
ID=fedora
VERSION_ID=17
PRETTY_NAME="Fedora 17 (Beefy Miracle)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:17"
22:19:04-androidyue~/Videos$ java -version
java version "1.7.0_25"
OpenJDK Runtime Environment (fedora-2.3.12.1.fc17-x86_64)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)

Verification idea: the following Java program reads an 82 MB video file into a string and then calls intern on that string.


22:01:17-androidyue~/Videos$ ll -lh | grep why_to_learn.mp4
-rw-rw-r--. 1 androidyue androidyue  82M Oct 20  2013 why_to_learn.mp4

The validation code


import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
public class TestMain {
  private static String fileContent;
  public static void main(String[] args) {
      fileContent = readFileToString(args[0]);
      if (null != fileContent) {
          fileContent = fileContent.intern();
          System.out.println("Not Null");
      }
  }
 
 
  private static String readFileToString(String file) {
      BufferedReader reader = null;
      try {
          reader = new BufferedReader(new FileReader(file));
          StringBuffer buff = new StringBuffer();
          String line;
          while ((line = reader.readLine()) != null) {
              buff.append(line);
          }
          return buff.toString();
      } catch (FileNotFoundException e) {
          e.printStackTrace();
      } catch (IOException e) {
          e.printStackTrace();
      } finally {
          if (null != reader) {
              try {
                  reader.close();
              } catch (IOException e) {
                  e.printStackTrace();
              }
          }
      }
      return null;
  }
}

This test relies on the string constant pool living in the permanent generation of heap memory, so it only applies before Java 8. (Note that HotSpot already moved interned strings from the permanent generation to the main heap in JDK 7.) We verify by setting the permanent generation to a small value. If the string objects themselves were stored in the string constant pool, the program would have to throw a java.lang.OutOfMemoryError: PermGen space error.


java -XX:PermSize=6m TestMain ~/Videos/why_to_learn.mp4

Running the test program does not throw an OOM error, which by itself is not a good proof of whether an object or a reference is stored.

But it does at least show that the actual content of the string, the char[], is not stored in the string constant pool. Given that, it doesn't really matter whether the string constant pool stores a string object or a reference to a string object. Personally, I still lean toward the view that it stores references.

Advantages and disadvantages

The benefit of string constant pools is that they reduce the creation of strings of the same content, saving memory space.

If there is a drawback, it is that CPU time is traded for space. The CPU time is spent mainly on looking up the string constant pool for a reference to an object with the same content. However, the pool is implemented internally as a hash table, so the cost of the lookup is low.
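To make the lookup cost concrete, here is a conceptual sketch of a pool built on a hash table. It is not the JVM's actual implementation; the class SimpleStringPool and its methods are hypothetical and only illustrate why finding an existing reference by content is cheap.

```java
import java.util.HashMap;
import java.util.Map;

public class SimpleStringPool {
    // Maps string content to the single shared instance with that content.
    private final Map<String, String> pool = new HashMap<String, String>();

    // Returns the canonical instance for the given content, adding it on first sight.
    public String canonical(String s) {
        String existing = pool.get(s); // hash lookup by content, O(1) on average
        if (existing != null) {
            return existing;
        }
        pool.put(s, s);
        return s;
    }

    public int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        SimpleStringPool p = new SimpleStringPool();
        String first = p.canonical(new String("droid"));
        String second = p.canonical(new String("droid"));
        System.out.println(first == second); // true
        System.out.println(p.size());        // 1
    }
}
```

Unlike the real pool, this sketch holds strong references, so nothing stored in it can ever be garbage collected.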

Garbage collection?

Because the string constant pool holds references to shared string objects, does this mean that these objects cannot be garbage collected?

First of all, the shared objects in question are generally small. As far as I can verify, this problem did exist in earlier versions, but with the introduction of weak references it should be gone by now.

For more on this question, see this article (link: http://mindprod.com/jgloss/interned.html#GC).

When to use intern?

The precondition for using intern is that you know you really need it. For example, suppose we have millions of records, and many of them carry the same value, such as the state name California. We don't want to create millions of such string objects, so we can use intern to keep only one copy in memory. For a deeper understanding of intern, see (link: http://tech.meituan.com/in_depth_understanding_string_intern.html).
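The record scenario can be sketched as follows. Everything here is an assumption made for illustration: the class InternRecordsDemo, the helper names, and the use of new String("California") as a stand-in for a value parsed from a file or database at run time.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class InternRecordsDemo {
    // Counts how many distinct String objects (by identity, not content) the list refers to.
    static int distinctObjects(List<String> values) {
        Map<String, Boolean> seen = new IdentityHashMap<String, Boolean>();
        for (String v : values) {
            seen.put(v, Boolean.TRUE);
        }
        return seen.size();
    }

    // Builds `count` state values at run time, optionally interning each one.
    static List<String> loadStates(int count, boolean intern) {
        List<String> states = new ArrayList<String>(count);
        for (int i = 0; i < count; i++) {
            String state = new String("California"); // stand-in for a freshly parsed value
            states.add(intern ? state.intern() : state);
        }
        return states;
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(loadStates(1000, false))); // 1000
        System.out.println(distinctObjects(loadStates(1000, true)));  // 1
    }
}
```

With intern, the temporary objects become garbage right away and only the single pooled copy stays reachable from the list.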

There are always exceptions, right?

Do you know how many string objects the following code creates, and how many references it stores in the string constant pool?


String test = "a" + "b" + "c";

The answer is that only one object is created and only one reference is saved in the constant pool. Let's use javap to decompile and take a look.


17:02 $ javap -c TestInternedPoolGC
Compiled from "TestInternedPoolGC.java"
public class TestInternedPoolGC extends java.lang.Object{
public TestInternedPoolGC();
  Code:
   0:  aload_0
   1:  invokespecial    #1; //Method java/lang/Object."<init>":()V
   4:  return

public static void main(java.lang.String[])   throws java.lang.Exception;
  Code:
   0:  ldc  #2; //String abc
   2:  astore_1
   3:  return

See, all three literals are actually combined into one at compile time. This is an optimization that avoids creating redundant string objects and avoids the cost of string concatenation at run time.
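The folding only applies to compile-time constant expressions. The sketch below contrasts a constant expression with a concatenation that involves a method parameter and therefore happens at run time; the class ConstantFoldingDemo and its members are hypothetical names used for illustration.

```java
public class ConstantFoldingDemo {
    // Constant expression: the compiler folds it into the single literal "abc".
    static final String FOLDED = "a" + "b" + "c";

    // `prefix` is not a compile-time constant, so this concatenation runs at
    // run time and produces a new String object on every call.
    static String concatAtRuntime(String prefix) {
        return prefix + "bc";
    }

    public static void main(String[] args) {
        String runtime = concatAtRuntime("a");

        System.out.println(FOLDED == "abc");           // true
        System.out.println(runtime == "abc");          // false
        System.out.println(runtime.equals("abc"));     // true
        System.out.println(runtime.intern() == "abc"); // true
    }
}
```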

