Dig into Java garbage collection

  • 2020-04-01 02:01:36
  • OfStack

1. Core idea of the garbage collection algorithm
The Java language provides a garbage collection mechanism that keeps track of objects in use and finds and reclaims objects that are no longer in use. This mechanism guards against two dangers of dynamic memory allocation: memory exhaustion caused by accumulated garbage, and illegal memory references caused by releasing memory improperly.

The core idea of a garbage collection algorithm is to identify the objects in the virtual machine's available memory, that is, in the heap space. If an object is still referenced, it is a live object; if it is no longer referenced, it is a garbage object, and the space it occupies can be reclaimed and reallocated. The choice of garbage collection algorithm and the tuning of garbage collection parameters directly affect system performance, so developers need a solid understanding of both.

2. Conditions that trigger the main GC
The JVM runs minor GC cycles very frequently, but because each one takes very little time, they have little impact on the system. The trigger conditions for the main GC deserve more attention because a main GC has a significant impact on the system. In general, two conditions trigger the main GC:

(1) When the application is idle, that is, when no application threads are running, the GC is called. Because GC runs in the lowest-priority thread, the GC thread is not called while the application is busy, except in the following case.

(2) When the Java heap runs out of memory, the GC is called. If a running application thread creates a new object and there is not enough memory for the allocation, the JVM forces the GC thread to run in order to reclaim memory for it. If the allocation still cannot be satisfied after one GC, the JVM makes two more GC attempts. If it still cannot be satisfied, the JVM reports an "out of memory" error (OutOfMemoryError) and the Java application typically stops.

Because the JVM decides whether to run the main GC based on a constantly changing system environment, the timing of the main GC is nondeterministic and cannot be predicted. What is certain is that in a long-running application the main GC occurs repeatedly.
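
Since the second trigger condition depends on how much heap is left, it can help to look at the figures the JVM itself reports. The following is a minimal sketch that reads them through the standard Runtime API; the class name HeapSnapshot and its fields are hypothetical names chosen for this example, and the numbers are only a rough indication because the heap changes while they are being read.

```java
// Minimal sketch: a snapshot of the heap figures exposed by java.lang.Runtime.
// HeapSnapshot and its member names are hypothetical, used only for illustration.
final class HeapSnapshot {
    final long maxBytes;   // upper limit the JVM will try to use for the heap
    final long totalBytes; // heap memory currently reserved by the JVM
    final long freeBytes;  // unused part of the currently reserved heap

    HeapSnapshot(long maxBytes, long totalBytes, long freeBytes) {
        this.maxBytes = maxBytes;
        this.totalBytes = totalBytes;
        this.freeBytes = freeBytes;
    }

    // Reads the three values one after another; they are not an atomic snapshot.
    static HeapSnapshot take() {
        Runtime rt = Runtime.getRuntime();
        return new HeapSnapshot(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
    }

    // Approximate number of bytes currently occupied (live objects plus uncollected garbage).
    long usedBytes() {
        return totalBytes - freeBytes;
    }

    public static void main(String[] args) {
        HeapSnapshot s = take();
        System.out.println("max=" + s.maxBytes
                + " total=" + s.totalBytes
                + " used=" + s.usedBytes());
    }
}
```

When the used figure approaches the maximum, new allocations are more likely to force a collection.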

3. Measures to reduce GC overhead
Given the GC mechanism described above, the way a program runs changes the system environment and therefore affects when GC is triggered. Code that is not designed with the characteristics of GC in mind can suffer a range of negative effects, such as objects staying resident in memory. The basic principle for avoiding these effects is to produce as little garbage as possible and to keep GC overhead low. Specific measures include the following:

(1) Do not explicitly call System.gc()
This method suggests that the JVM run the main GC. It is only a suggestion, not a guarantee, but in many cases it does trigger the main GC, which raises the frequency of the main GC and therefore the number of intermittent pauses. Conversely, an explicit System.gc() call in code is not guaranteed to collect anything. This can be verified with the finalize() method: calling System.gc() does not necessarily cause finalize() to be called each time. The finalize() method is called on an object before that object is reclaimed.
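
The "suggestion, not a guarantee" behavior can be observed with a small experiment. The sketch below uses a WeakReference instead of finalize() as the probe (a substitution made for this example, since finalize() is deprecated in recent Java versions); the class name GcHintDemo and its methods are hypothetical. No particular output is promised for the unreachable object, because the JVM is free to ignore the hint.

```java
import java.lang.ref.WeakReference;

// Minimal sketch: System.gc() is a request, so the result for an unreachable
// object may differ between runs and JVMs. GcHintDemo is a hypothetical name.
final class GcHintDemo {
    // Returns true if the weakly referenced object had been cleared after the hint.
    static boolean clearedAfterHint(WeakReference<Object> ref) {
        System.gc(); // a hint only; the JVM decides whether and what to collect
        return ref.get() == null;
    }

    // An object that is still strongly referenced is never reclaimed,
    // no matter how often System.gc() is called.
    static boolean strongSurvives() {
        Object strong = new Object();
        WeakReference<Object> ref = new WeakReference<>(strong);
        boolean cleared = clearedAfterHint(ref);
        return !cleared && ref.get() == strong; // 'strong' is still in use here
    }

    public static void main(String[] args) {
        // Only the WeakReference points to this object, so it is eligible for collection.
        WeakReference<Object> ref = new WeakReference<>(new Object());
        System.out.println("cleared after hint: " + clearedAfterHint(ref));
        System.out.println("strong object kept: " + strongSurvives());
    }
}
```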

(2) Minimize the use of temporary objects
Temporary objects become garbage once the function call that created them returns. Using fewer temporary variables means generating less garbage, which delays the second trigger condition described above and reduces the chance of a main GC.
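
One common way to cut down on temporaries is to reuse a single working object across loop iterations. The following is a minimal sketch of that idea; the class ReuseBuffer and the "item-N" label format are hypothetical and exist only for this example.

```java
// Minimal sketch: one StringBuilder is reused for every iteration instead of
// allocating a new builder each time. ReuseBuffer is a hypothetical name.
final class ReuseBuffer {
    static String[] labels(int count) {
        String[] out = new String[count];
        StringBuilder sb = new StringBuilder(); // allocated once, outside the loop
        for (int i = 0; i < count; i++) {
            sb.setLength(0);                    // reset contents, keep the capacity
            sb.append("item-").append(i);
            out[i] = sb.toString();             // the result String is still needed
        }
        return out;
    }

    public static void main(String[] args) {
        for (String label : labels(3)) {
            System.out.println(label);
        }
    }
}
```

The result strings are still allocated because the caller needs them; only the working buffer is shared.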

(3) Set object references to null when they are no longer in use
In general, an object that no variable refers to is treated as garbage, so explicitly setting references that are no longer needed to null helps the GC identify garbage and makes collection more efficient.
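
A typical place where this matters is a container that manages its own storage. The sketch below is a simple array-backed stack, assumed here purely for illustration (SimpleStack and slotAboveTopIsNull are hypothetical names): without the assignment to null in pop(), the array would keep referencing popped objects and they could not be collected.

```java
import java.util.Arrays;

// Minimal sketch: an array-backed stack that clears obsolete references.
// SimpleStack is a hypothetical class written for this example.
final class SimpleStack {
    private Object[] items = new Object[4];
    private int size = 0;

    void push(Object o) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // grow when full
        }
        items[size++] = o;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object top = items[--size];
        items[size] = null; // drop the stale reference so the object can be collected
        return top;
    }

    int size() {
        return size;
    }

    // Exposed only for the example: checks that the slot above the top holds no reference.
    boolean slotAboveTopIsNull() {
        return size == items.length || items[size] == null;
    }
}
```

For ordinary local variables that go out of scope soon anyway, assigning null usually changes little; the benefit shows up with long-lived holders such as this array.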

(4) Use StringBuffer instead of String to accumulate strings (see another blog post on String and StringBuffer in Java)
A String object has a fixed length. Concatenating String objects does not grow an existing String; it creates a new String object. For example, Str5 = Str1 + Str2 + Str3 + Str4 can produce several garbage objects while it executes, because each "+" operation may create a new String, and these transitional objects serve no purpose for the program and only add garbage. To avoid this, accumulate strings in a StringBuffer, which has variable length and grows in place without producing intermediate String objects.
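
The difference is easiest to see when strings are accumulated in a loop. The following is a minimal sketch comparing the two approaches; the class StringAccumulation and its method names are hypothetical.

```java
// Minimal sketch: accumulating with "+" versus with a StringBuffer.
// StringAccumulation is a hypothetical name used for this example.
final class StringAccumulation {
    // Each pass creates a new String holding the concatenation so far;
    // the previous value of 'result' becomes garbage.
    static String concatWithPlus(String[] parts) {
        String result = "";
        for (String p : parts) {
            result = result + p;
        }
        return result;
    }

    // The buffer grows in place; only the final toString() creates a String.
    static String concatWithBuffer(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c", "d"};
        System.out.println(concatWithPlus(parts));
        System.out.println(concatWithBuffer(parts));
    }
}
```

Both methods return the same text. When the buffer is used by a single thread, StringBuilder offers the same methods without synchronization.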

(5) Use primitive types such as int and long instead of Integer and Long objects
Primitive variables take up much less memory than the corresponding objects, so prefer primitive types unless an object is actually required. When do I need to use an Integer?
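
The cost of wrapper objects often appears through autoboxing in loops. The sketch below sums the same range twice, once with a long accumulator and once with a Long accumulator; the class BoxingDemo and its method names are hypothetical.

```java
// Minimal sketch: the same sum with a primitive and with a boxed accumulator.
// BoxingDemo is a hypothetical name used for this example.
final class BoxingDemo {
    // No objects are created; 'sum' is a plain 64-bit value.
    static long sumPrimitive(int n) {
        long sum = 0;
        for (long i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Each 'sum += i' unboxes, adds, and boxes the result again,
    // which may allocate a new Long object per iteration.
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (long i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(100));
        System.out.println(sumBoxed(100));
    }
}
```

Wrapper types remain necessary where an object is required, for example as elements of a List<Integer> or when null must represent a missing value.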

(6) Use as few static object variables as possible
Static variables are global variables. The objects they reference are not reclaimed by the GC while the reference is held, so they remain in memory.
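
A static collection is the usual way this turns into a problem. The following is a minimal sketch, with a hypothetical class StaticCache, showing that everything added to a static list stays reachable until the references are removed.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: objects held by a static field stay reachable.
// StaticCache is a hypothetical class written for this example.
final class StaticCache {
    // Every entry added here is reachable through this static field.
    private static final List<byte[]> CACHE = new ArrayList<>();

    static void remember(byte[] data) {
        CACHE.add(data);
    }

    static int retainedEntries() {
        return CACHE.size();
    }

    // Removing the references is what makes the entries collectable again.
    static void clear() {
        CACHE.clear();
    }

    public static void main(String[] args) {
        remember(new byte[1024]);
        remember(new byte[1024]);
        System.out.println("retained: " + retainedEntries());
        clear();
        System.out.println("retained after clear: " + retainedEntries());
    }
}
```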

(7) Spread out object creation and deletion over time
Creating a large number of new objects in a short burst, especially large objects, suddenly demands a large amount of memory. Faced with this, the JVM can only run the main GC to reclaim memory or compact fragmented memory, which increases the frequency of the main GC. The same applies to releasing many objects at once: the sudden appearance of a large number of garbage objects reduces free space, which greatly increases the chance that the next object creation forces a main GC.

4. Garbage collection algorithm
(1) Reference counting collector
Reference counting is an early garbage collection strategy. In this method, each object in the heap has a reference count. When an object is created and a reference to it is assigned to a variable, the object's reference count is set to 1. For example, A a = new A(); creates a new object with a count of 1. If a is then assigned to another variable b, that is, b = a;, the count of the object increases by 1. Whenever any other variable is assigned a reference to the object, the count increases by 1. When a reference to the object goes out of scope or is set to a new value, the count decreases by 1. For example, after b = c;, the count of the object that a refers to decreases by 1. Any object with a reference count of 0 can be garbage collected. When an object is garbage collected, the count of every object it references decreases by 1, so collecting one object can lead to the collection of others. For example, if an object b holds the only reference to an object a, then when b is garbage collected the reference count of a drops to zero and a is garbage collected as well.

Benefits of the method: A reference counting collector runs quickly and is interleaved with program execution. This property is useful in real-time environments where the program cannot be interrupted for long periods.

Disadvantages of the method: Reference counting cannot detect cycles (two or more objects that reference each other). For example, a parent object holds a reference to a child object, which in turn references the parent. The counts of these objects can never reach zero, even when the objects are no longer reachable from the root objects of the executing program. Another disadvantage is that every increment or decrement of a reference count adds overhead.
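
Both the cascading release and the cycle problem can be reproduced with a toy model. The following is a minimal sketch that simulates reference counts by hand; it is not how any JVM manages memory, and the class RefCounted with its methods retain, release and pointTo is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: a hand-rolled simulation of reference counting.
// RefCounted and its method names are hypothetical, for illustration only.
final class RefCounted {
    int count = 0;                                    // references currently held to this object
    boolean freed = false;                            // set once the count reaches zero
    final List<RefCounted> refs = new ArrayList<>();  // objects this object references

    // A variable or another object starts referring to this object.
    void retain() {
        count++;
    }

    // A reference is dropped. At zero the object is freed and
    // releases every object it references.
    void release() {
        if (freed) {
            return;
        }
        count--;
        if (count == 0) {
            freed = true;
            for (RefCounted r : refs) {
                r.release();
            }
        }
    }

    // This object takes a reference to 'target'.
    void pointTo(RefCounted target) {
        refs.add(target);
        target.retain();
    }

    // b holds the only reference to a: freeing b also frees a.
    static boolean cascadeFrees() {
        RefCounted b = new RefCounted();
        b.retain();          // a root variable refers to b
        RefCounted a = new RefCounted();
        b.pointTo(a);
        b.release();         // the root variable is dropped
        return b.freed && a.freed;
    }

    // parent and child reference each other: neither count reaches zero.
    static boolean cycleLeaks() {
        RefCounted parent = new RefCounted();
        parent.retain();     // a root variable refers to parent
        RefCounted child = new RefCounted();
        parent.pointTo(child);
        child.pointTo(parent);
        parent.release();    // the root variable is dropped, but child still counts
        return !parent.freed && !child.freed;
    }
}
```

In the cycle case both objects are unreachable from any root, yet each keeps the other's count at 1.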

(2) Tracing collector
Garbage detection is usually done by establishing a set of root objects and determining which objects are reachable from them. An object is reachable if there is a path of references to it from a root object that the executing program can access. Root objects are always accessible to the program. Starting from the root objects, any object that can be reached is considered live. Objects that cannot be reached are considered garbage because they can no longer affect the execution of the program.

The tracing collector traces the object reference graph starting from the root nodes. Objects encountered during tracing are marked in some way. In general, the mark is either set on the object itself or recorded in a separate bitmap. When tracing ends, the unmarked objects are unreachable and can be collected.

The basic tracing algorithm is called "mark and sweep." The name refers to the two phases of garbage collection. In the mark phase, the garbage collector traverses the reference graph and marks every object it encounters. In the sweep phase, unmarked objects are freed and the memory they occupied is made available to the executing program again. In the Java virtual machine, the sweep phase must also include finalization of objects.
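
The two phases can be sketched on a toy heap. The code below is a simplified simulation, not the JVM's implementation: the classes MarkSweepHeap and Node are hypothetical, the mark is stored on the object itself rather than in a bitmap, and finalization is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Minimal sketch: mark and sweep over a toy object graph.
// MarkSweepHeap and Node are hypothetical names used for this example.
final class MarkSweepHeap {
    static final class Node {
        final String name;
        boolean marked;                             // mark kept on the object itself
        final List<Node> refs = new ArrayList<>();  // outgoing references

        Node(String name) {
            this.name = name;
        }
    }

    private final List<Node> heap = new ArrayList<>();  // every allocated object
    private final List<Node> roots = new ArrayList<>(); // the root set

    Node allocate(String name) {
        Node n = new Node(name);
        heap.add(n);
        return n;
    }

    void addRoot(Node n) {
        roots.add(n);
    }

    // Runs one collection and returns the number of objects swept.
    int collect() {
        // Mark phase: visit everything reachable from the roots.
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) {
                continue; // already visited; this also stops cycles from looping
            }
            n.marked = true;
            pending.addAll(n.refs);
        }
        // Sweep phase: free unmarked objects, reset marks on survivors.
        int swept = 0;
        for (Iterator<Node> it = heap.iterator(); it.hasNext();) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;
            } else {
                it.remove();
                swept++;
            }
        }
        return swept;
    }

    int liveCount() {
        return heap.size();
    }

    // root -> kept stays alive; x <-> y is an unreachable cycle and is swept.
    static int demo() {
        MarkSweepHeap h = new MarkSweepHeap();
        Node root = h.allocate("root");
        Node kept = h.allocate("kept");
        Node x = h.allocate("x");
        Node y = h.allocate("y");
        h.addRoot(root);
        root.refs.add(kept);
        x.refs.add(y);
        y.refs.add(x);
        return h.collect();
    }
}
```

Because liveness is decided by reachability from the roots, the mutually referencing pair x and y is collected here, which is the case that reference counting cannot handle.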

