A brief analysis of the caching and lazy loading mechanisms in the Hibernate framework for Java

  • 2020-04-01 04:26:33
  • OfStack

The difference between Hibernate's first-level cache and second-level cache
A cache sits between the application and the physical data source. Its role is to reduce how often the application accesses the physical data source, thereby improving application performance. The data in the cache is a copy of data in the physical data source: the application reads and writes the cache at run time, and the cache is synchronized with the physical data source at particular times or on particular events.
The cache medium is usually memory, so reads and writes are fast. If the amount of cached data is very large, however, the hard disk may also be used as a cache medium. A cache implementation must consider not only the storage medium but also how concurrent access to the cache is managed and the life cycle of the cached data.
Hibernate's caches comprise the Session cache and the SessionFactory cache, and the SessionFactory cache can in turn be divided into a built-in cache and an external cache. The Session cache is built in and cannot be removed; it is also known as Hibernate's first-level cache.
The built-in cache of the SessionFactory is implemented much like the Session cache: the former is the data held in certain collection properties of the SessionFactory object, the latter the data held in certain collection properties of the Session. The SessionFactory's built-in cache stores the mapping metadata and the predefined SQL statements. The mapping metadata is a copy of the data in the mapping files, and the predefined SQL statements are derived from the mapping metadata during Hibernate's initialization phase. This built-in cache is read-only: the application cannot modify the cached mapping metadata or predefined SQL statements, so the built-in cache never needs to be synchronized with the mapping files.
The external cache of the SessionFactory is a configurable plug-in that is not enabled by default. Its data is a copy of database data, and its medium can be memory or hard disk. The external cache of the SessionFactory is also known as Hibernate's second-level cache.
Both of Hibernate's cache levels sit in the persistence layer and hold copies of database data, so what is the difference between them? To understand it, we need to look more closely at two characteristics of a persistence-layer cache: its scope and its concurrent access policy.
The scope of a persistence-layer cache
The scope of a cache determines its life cycle and who can access it. Cache scopes fall into three categories.
Transaction scope: the cache can be accessed only by the current transaction. The life cycle of the cache depends on the life cycle of the transaction: when the transaction ends, so does the cache. In this scope the cache medium is memory. The transaction can be a database transaction or an application transaction, each transaction has its own cache, and the cached data usually takes the form of interrelated objects.
Process scope: the cache is shared by all transactions within the process. These transactions may access the cache concurrently, so the necessary transaction isolation mechanisms must be applied to the cache. The life cycle of the cache depends on the life cycle of the process: when the process ends, so does the cache. A process-scope cache may hold large amounts of data, so its medium can be memory or hard disk. The cached data can take the form either of interrelated objects or of loose object data. Loose object data is somewhat similar to an object's serialized form, but the algorithm that decomposes an object into loose data is faster than object serialization.
Cluster scope: in a clustered environment, the cache is shared by processes on one or more machines. The cached data is replicated to every process node in the cluster, and the processes communicate with each other to keep the cached data consistent. The cached data usually takes the form of loose object data.
For most applications, think carefully before using a cluster-scope cache, because accessing it is not necessarily much faster than accessing the database directly.
The persistence layer can provide caches at several scopes. If the requested data is not found in the transaction-scope cache, it can be looked up in the process-scope or cluster-scope cache, and only if it is not found there does the database have to be queried. The transaction-scope cache is the first-level cache of the persistence layer and is usually mandatory; the process-scope or cluster-scope cache is the second-level cache of the persistence layer and is usually optional.
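The lookup order just described can be sketched in plain Java. This is only an illustration under stated assumptions, not Hibernate's implementation: the class `LayeredLookup`, its maps, and the `databaseHits` counter are hypothetical stand-ins for the transaction-scope cache, the process-scope cache, and the database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a transaction-scope cache in front of a shared
// process-scope cache, with a Map standing in for the database.
class LayeredLookup {
    // Process scope: shared by every transaction, so it must be thread-safe.
    static final Map<Long, String> processCache = new ConcurrentHashMap<>();
    // Stand-in for the physical data source.
    static final Map<Long, String> database = new ConcurrentHashMap<>();
    // Counts how often the "database" had to be consulted.
    static final AtomicInteger databaseHits = new AtomicInteger();

    // Transaction scope: private to one unit of work and discarded with it.
    private final Map<Long, String> transactionCache = new HashMap<>();

    String find(long id) {
        String value = transactionCache.get(id);
        if (value != null) {
            return value;                       // first-level hit
        }
        value = processCache.get(id);
        if (value == null) {                    // second-level miss
            databaseHits.incrementAndGet();
            value = database.get(id);
            if (value != null) {
                processCache.put(id, value);
            }
        }
        if (value != null) {
            transactionCache.put(id, value);
        }
        return value;
    }
}
```

Each `LayeredLookup` instance plays the role of one transaction: a second instance that asks for the same id is served from the shared process-scope map without touching the database stand-in.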
Concurrent access policy for the cache of the persistence layer
When multiple concurrent transactions access the same data cached in the persistence layer at the same time, concurrency problems arise, and the necessary transaction isolation measures must be taken.
Concurrency problems can occur in process-scope or cluster-scope caches, that is, in second-level caches. For this reason, one of the following four concurrent access policies can be set, each corresponding to a transaction isolation level.
Transactional: applicable only in managed environments. It provides the Repeatable Read isolation level. It can be used for data that is read often but rarely modified, because it prevents concurrency problems such as dirty reads and non-repeatable reads.
Read-write: provides the Read Committed isolation level. It applies only to non-clustered environments. It can be used for data that is read often but rarely modified, because it prevents concurrency problems such as dirty reads.
Non-strict read-write: consistency between the cache and the database is not guaranteed. If two transactions may access the same cached data at the same time, a very short expiration time must be configured for that data to keep dirty reads to a minimum. This policy can be used for data that is rarely modified and for which occasional dirty reads are acceptable.
Read-only: this policy can be used for data that is never modified, such as reference data.
The transactional policy provides the highest isolation level and the read-only policy the lowest. The higher the isolation level, the lower the concurrency performance.
What kind of data is suitable for the second-level cache?
1. Data that is rarely modified
2. Data that is not critical, for which occasional concurrency problems are acceptable
3. Data that will not be accessed concurrently
4. Reference data
What kind of data is not suitable for the second-level cache?
1. Data that is frequently modified
2. Financial data, for which concurrency problems are absolutely unacceptable
3. Data shared with other applications
Hibernate's second level cache
As mentioned earlier, Hibernate provides two levels of caching. The first level is the Session cache. Since the life cycle of a Session object usually corresponds to one database transaction or one application transaction, its cache is a transaction-scope cache. The first-level cache is mandatory: it cannot be disabled, and in fact it cannot be removed. In the first-level cache, every instance of a persistent class has a unique OID.
The second-level cache is a pluggable cache managed by the SessionFactory. Since the life cycle of the SessionFactory object corresponds to the entire lifetime of the application process, the second-level cache is a process-scope or cluster-scope cache. It holds loose object data. Data in the second-level cache can be subject to concurrency problems, so an appropriate concurrent access policy is required, which gives the cached data a transaction isolation level. A cache adapter integrates the concrete cache implementation with Hibernate. The second-level cache is optional and can be configured at the granularity of an individual class or collection.
The general process of Hibernate's second-level caching strategy is as follows:
1) For a conditional query, Hibernate always issues an SQL statement of the form select * from table_name where ... (selecting all fields) and fetches all the matching data objects from the database in one go.
2) All the data objects obtained are put into the second-level cache, keyed by ID.
3) When Hibernate accesses a data object by ID, it first looks in the Session's first-level cache. If the object is not found there and a second-level cache is configured, it looks in the second-level cache. If the object is still not found, it queries the database and puts the result into the cache by ID.
4) The cache is updated whenever data is deleted, updated, or added.
Hibernate's second-level caching strategy is designed for queries by ID and has no effect on conditional queries. For those, Hibernate provides a query cache.
The process of Hibernate's query caching strategy is as follows:
1) Hibernate first builds a query key from the general information in the conditional query request: the SQL, the parameters the SQL requires, the record range (the starting position rowStart and the maximum number of records maxRows), and so on.
2) Hibernate uses the query key to look up the corresponding result list in the query cache. If the list exists, it is returned. If not, Hibernate queries the database, obtains the result list, and puts the whole list into the query cache under the query key.
3) The SQL in a query key involves certain table names. If any data in those tables is modified, deleted, or added, the query keys that involve them are cleared from the cache.
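The three steps above can be sketched as a small, self-contained Java model. It is not Hibernate's query cache: `QueryKey`, `QueryCacheSketch`, the `tables` set, and the `databaseQueries` counter are hypothetical names introduced only for illustration, and result rows are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.Supplier;

// Step 1: a key built from the SQL, its parameters, and the record range.
final class QueryKey {
    final String sql;
    final List<Object> params;
    final int rowStart;
    final int maxRows;
    final Set<String> tables; // tables the SQL reads; used for invalidation only

    QueryKey(String sql, List<Object> params, int rowStart, int maxRows, Set<String> tables) {
        this.sql = sql;
        this.params = new ArrayList<>(params);
        this.rowStart = rowStart;
        this.maxRows = maxRows;
        this.tables = new HashSet<>(tables);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof QueryKey)) {
            return false;
        }
        QueryKey that = (QueryKey) other;
        return sql.equals(that.sql) && params.equals(that.params)
                && rowStart == that.rowStart && maxRows == that.maxRows;
    }

    @Override
    public int hashCode() {
        return Objects.hash(sql, params, rowStart, maxRows);
    }
}

class QueryCacheSketch {
    private final Map<QueryKey, List<String>> results = new HashMap<>();
    int databaseQueries = 0;

    // Step 2: return the cached list, or run the query and cache the whole list.
    List<String> list(QueryKey key, Supplier<List<String>> runQuery) {
        List<String> cached = results.get(key);
        if (cached != null) {
            return cached;
        }
        databaseQueries++;
        List<String> fresh = new ArrayList<>(runQuery.get());
        results.put(key, fresh);
        return fresh;
    }

    // Step 3: a change to a table clears every cached query that involves it.
    void tableModified(String table) {
        results.keySet().removeIf(key -> key.tables.contains(table));
    }
}
```

Two keys built from the same SQL, parameters, and range are equal, so the second request is answered from the cache until `tableModified` is called for one of the tables involved.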

Hibernate lazy loading mechanism
Lazy loading:

The lazy loading mechanism was introduced to avoid unnecessary performance overhead. Lazy loading means that data is loaded only when it is actually needed. Hibernate provides lazy loading of entity objects and lazy loading of collections, and Hibernate3 additionally provides lazy loading of properties. Let's go through each of these kinds of lazy loading in detail.

A. Lazy loading of entity objects:

To use lazy loading for an entity object, you must configure it in the entity's mapping file, as shown below:

<hibernate-mapping>
  <class name="com.neusoft.entity.User" table="user" lazy="true">
    ...
  </class>
</hibernate-mapping>

Lazy loading of the entity is enabled by setting the lazy attribute of the class element to true. Suppose we run the following code:

User user = (User) session.load(User.class, "1");  // (1)
System.out.println(user.getName());                // (2)

When execution reaches (1), Hibernate has not yet issued any query for the data. If we inspect the user object in memory at this point with a debugging tool (such as the debugger in JBuilder 2005), we are surprised to find that the returned object may be of a type such as User$$EnhancerByCGLIB$$bede8986 and that its properties are null. Recall the earlier discussion of the session.load() method, which returns a proxy object for the entity: the object returned here is a proxy for the User object. Hibernate uses CGLIB to construct the proxy class of the target object dynamically. The proxy object contains all the properties and methods of the target object, and all of its properties are set to null. The memory snapshot shown by the debugger reveals that the real User object is held in the proxy's CGLIB$CALLBACK_0.target attribute. When the code reaches (2) and calls user.getName(), the callback mechanism provided by CGLIB actually invokes CGLIB$CALLBACK_0.getName(). Hibernate first checks whether CGLIB$CALLBACK_0.target is null. If it is not null, Hibernate calls the getName method of the target object. If it is null, Hibernate issues a database query, constructs the target object, assigns it to CGLIB$CALLBACK_0.target, and then calls getName on it.

In this way, Hibernate implements lazy loading of entities through an intermediate proxy object: the database query is issued only when the user actually performs an operation that reads the entity's properties. Because lazy loading of entities relies on this intermediate proxy class, only the session.load() method takes advantage of it, since only session.load() returns a proxy object for the entity class.
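The proxy mechanism can be imitated with the JDK alone. The sketch below swaps in a `java.lang.reflect.Proxy` for CGLIB, so unlike CGLIB it needs an interface to proxy; the `UserView` interface, `LazyHandler`, and `LazyProxies` are hypothetical names, and the `Supplier` stands in for the database query. It illustrates the idea of a null `target` that is filled on first use, not Hibernate's actual proxy code.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical entity interface: JDK proxies can only implement interfaces,
// whereas CGLIB generates a subclass of the entity class itself.
interface UserView {
    String getName();
}

class LazyHandler<T> implements InvocationHandler {
    private final Supplier<T> loader; // stands in for the select statement
    private T target;                 // stays null until the first method call

    LazyHandler(Supplier<T> loader) {
        this.loader = loader;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (target == null) {
            target = loader.get();    // the "query" runs only at this point
        }
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();       // rethrow what the real method threw
        }
    }
}

class LazyProxies {
    // Plays the role of session.load(): returns a proxy, loads nothing yet.
    @SuppressWarnings("unchecked")
    static <T> T lazy(Class<T> type, Supplier<T> loader) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, new LazyHandler<>(loader));
    }
}
```

Calling `LazyProxies.lazy(UserView.class, loader)` corresponds to step (1): nothing is loaded. The first `getName()` call on the result corresponds to step (2) and triggers the loader exactly once.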

B. Lazy loading of collection types:

Among Hibernate's lazy loading mechanisms, the one applied to collection types matters most, because it has the potential to improve performance considerably. Hibernate puts a good deal of effort into it, including its own implementations of the JDK collections. In a one-to-many association, the Set that holds the associated objects is, at run time, not one of the JDK's own implementations but an instance of net.sf.hibernate.collection.Set, Hibernate's implementation of the java.util.Set interface. Hibernate implements lazy loading of collection types through these custom collection classes. To use lazy loading for a collection type, we must configure the association part of our entity class as follows:

<hibernate-mapping>
  <class name="com.neusoft.entity.User" table="user">
    ...
    <set name="addresses" table="address" lazy="true" inverse="true">
      <key column="user_id"/>
      <one-to-many class="com.neusoft.entity.Address"/>
    </set>
  </class>
</hibernate-mapping>

Setting the lazy attribute of the <set> element to true enables lazy loading of the collection. Let's look at the following code:

User user = (User) session.load(User.class, "1");
Collection addset = user.getAddresses();  // (1)
Iterator it = addset.iterator();          // (2)
while (it.hasNext()) {
    Address address = (Address) it.next();
    System.out.println(address.getAddress());
}

When the program reaches (1), no query is issued to load the associated data. Only when it reaches (2) does the actual read begin. At that point, Hibernate uses the matching data index in the cache to find the entity objects that qualify.
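The behaviour at (1) and (2) can be imitated with a collection wrapper that defers loading until its contents are first needed. The following is a minimal sketch, assuming a hypothetical `LazySet` class whose `Supplier` stands in for the select on the address table; Hibernate's own collection classes are considerably more involved.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical lazy set: handing out the set costs nothing, and the loader
// runs only when the elements are first needed.
class LazySet<E> extends AbstractSet<E> {
    private final Supplier<Set<E>> loader; // stands in for the select statement
    private Set<E> loaded;                 // null until initialized

    LazySet(Supplier<Set<E>> loader) {
        this.loader = loader;
    }

    boolean isInitialized() {
        return loaded != null;
    }

    private Set<E> data() {
        if (loaded == null) {
            loaded = new LinkedHashSet<>(loader.get());
        }
        return loaded;
    }

    @Override
    public Iterator<E> iterator() {
        return data().iterator(); // corresponds to step (2): the first real read
    }

    @Override
    public int size() {
        return data().size();
    }
}
```

Obtaining a reference to a `LazySet` is the analogue of `user.getAddresses()`: `isInitialized()` stays false until `iterator()` or `size()` is called.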

Here we introduce a new concept, the data index, so let's first explain what it is. When Hibernate caches a collection, it caches it in two parts: first the list of ids of all the entities in the collection, and then the entity objects themselves. The list of ids is what we call the data index. When Hibernate looks up the data index and does not find one, it executes a select statement to fetch the matching data, constructs the set of entity objects and the data index, returns the set of entity objects, and puts both the entity objects and the data index into Hibernate's cache. If, on the other hand, the data index is found, Hibernate takes the list of ids from it and looks up each entity in the cache by id, returning it from the cache if it is found and issuing a select query if it is not. This exposes another issue that can affect performance: the cache strategy for collection types. Suppose we configure the collection as follows:

<hibernate-mapping>
  <class name="com.neusoft.entity.User" table="user">
    ...
    <set name="addresses" table="address" lazy="true" inverse="true">
      <cache usage="read-only"/>
      <key column="user_id"/>
      <one-to-many class="com.neusoft.entity.Address"/>
    </set>
  </class>
</hibernate-mapping>

Here we apply the <cache usage="read-only"/> configuration. With this strategy configured on the collection, Hibernate caches only the data index, not the entity objects in the collection. With the configuration above, we run the following code:

User user = (User) session.load(User.class, "1");
Collection addset = user.getAddresses();
Iterator it = addset.iterator();
while (it.hasNext()) {
    Address address = (Address) it.next();
    System.out.println(address.getAddress());
}
System.out.println("Second query ...");
User user2 = (User) session.load(User.class, "1");
// Iterate through an Iterator; a Collection has no hasNext() method.
Iterator it2 = user2.getAddresses().iterator();
while (it2.hasNext()) {
    Address address2 = (Address) it2.next();
    System.out.println(address2.getAddress());
}

If you run this code, you get something like this:

Select * from user where id='1';
Select * from address where user_id='1';
Tianjin
Dalian
Second query ...
Select * from address where id='1';
Select * from address where id='2';
Tianjin
Dalian

We see that the second pass issued two queries against the address table. Why? After the entity was loaded the first time, the collection's cache strategy caused only the data index of the collection to be cached, not the entity objects in it. On the second load, Hibernate found the data index for the entity but could not find the corresponding entities in the cache, so it issued one select statement for each id in the data index. This wastes performance. How can it be avoided? We must also specify a cache strategy for the entities in the collection, so we configure the collection as follows:

<hibernate-mapping>
  <class name="com.neusoft.entity.User" table="user">
    ...
    <set name="addresses" table="address" lazy="true" inverse="true">
      <cache usage="read-write"/>
      <key column="user_id"/>
      <one-to-many class="com.neusoft.entity.Address"/>
    </set>
  </class>
</hibernate-mapping>

With this configuration, Hibernate also caches the entities in the collection. If you run the code above again, you get output similar to the following:

Select * from user where id='1';
Select * from address where user_id='1';
Tianjin
Dalian
Second query ...
Tianjin
Dalian

This time no SQL statements are issued for the ids in the data index, because the entity objects stored in the collection can be retrieved directly from the cache.
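The difference between the two outputs can be reproduced with a small simulation of the data index as this article describes it. Everything here is a hypothetical stand-in: `CollectionCacheSketch`, its in-memory address table, and the `cacheEntities` flag (which models whether the entities are cached alongside the data index) are assumptions made for illustration, not Hibernate internals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CollectionCacheSketch {
    // "Database": address id -> {owner user id, city}.
    private final Map<Integer, String[]> addressTable = new LinkedHashMap<>();
    // Data index: owner id -> ids of the addresses in that owner's collection.
    private final Map<String, List<Integer>> dataIndex = new HashMap<>();
    // Entity cache: address id -> city; filled only when cacheEntities is true.
    private final Map<Integer, String> entityCache = new HashMap<>();
    private final boolean cacheEntities;
    // Every simulated SQL statement is recorded here.
    final List<String> sqlLog = new ArrayList<>();

    CollectionCacheSketch(boolean cacheEntities) {
        this.cacheEntities = cacheEntities;
        addressTable.put(1, new String[] {"1", "Tianjin"});
        addressTable.put(2, new String[] {"1", "Dalian"});
    }

    List<String> loadAddresses(String userId) {
        List<String> cities = new ArrayList<>();
        List<Integer> ids = dataIndex.get(userId);
        if (ids == null) {
            // No data index yet: one select fetches the whole collection.
            sqlLog.add("select * from address where user_id='" + userId + "'");
            ids = new ArrayList<>();
            for (Map.Entry<Integer, String[]> row : addressTable.entrySet()) {
                if (row.getValue()[0].equals(userId)) {
                    ids.add(row.getKey());
                    cities.add(row.getValue()[1]);
                    if (cacheEntities) {
                        entityCache.put(row.getKey(), row.getValue()[1]);
                    }
                }
            }
            dataIndex.put(userId, ids);
            return cities;
        }
        // Data index found: resolve each id, with one select per cache miss.
        for (Integer id : ids) {
            String city = entityCache.get(id);
            if (city == null) {
                sqlLog.add("select * from address where id='" + id + "'");
                city = addressTable.get(id)[1];
                if (cacheEntities) {
                    entityCache.put(id, city);
                }
            }
            cities.add(city);
        }
        return cities;
    }
}
```

Calling `loadAddresses("1")` twice on `new CollectionCacheSketch(false)` records three statements, one for the collection and one per address on the second pass, while the same calls on `new CollectionCacheSketch(true)` record only the first.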

C. Lazy loading of properties:

Hibernate3 introduces a new feature, lazy loading of properties, which is a powerful tool for high-performance queries. Recall the earlier discussion of reading large data objects: the User object has a resume field of type java.sql.Clob that holds the user's resume. When we load the object, this field is loaded every time, whether or not we actually need it, and reading a large data object carries a significant performance cost of its own. In Hibernate2, the only way around this was to split the User class using the performance-oriented fine-grained decomposition discussed earlier (see that section). In Hibernate3, the property lazy loading mechanism lets us read the field's data only when we actually need to operate on it. To do so, we must configure our entity class as follows:

<hibernate-mapping>
  <class name="com.neusoft.entity.User" table="user">
    ...
    <property name="resume" type="java.sql.Clob" column="resume" lazy="true"/>
  </class>
</hibernate-mapping>

Setting the lazy attribute of the <property> element to true enables lazy loading of that property. To implement it, Hibernate3 uses a class enhancer to post-process the entity class's class file; the enhancer adds the callback logic provided by CGLIB to the entity class. So lazy loading of properties is again implemented with CGLIB. CGLIB is an open source library that manipulates the bytecode of Java classes and dynamically constructs the required class objects from bytecode. With the configuration above, we run the following code:

String hql = "from User user where user.name='zx'";
Query query = session.createQuery(hql);
List list = query.list();  // (1) the query is executed here, not by createQuery()
for (int i = 0; i < list.size(); i++) {
    User user = (User) list.get(i);
    System.out.println(user.getName());
    System.out.println(user.getResume());  // (2)
}

When executed at (1), the following SQL statement is generated:


Select id,age,name from user where name='zx'; 

At this time, Hibernate will retrieve the field data corresponding to all the non-lazily loaded attributes in the User entity. When executed at (2), the following SQL statement will be generated:


Select resume from user where id='1'; 

The actual read operation on the resume field data is initiated.
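The effect of a lazily loaded property can be sketched without bytecode enhancement by deferring the expensive field behind a loader. This is only an illustration under stated assumptions: `LazyUser` and its `resumeLoader` are hypothetical, the resume is modeled as a String rather than a java.sql.Clob, and Hibernate3 achieves the same effect by enhancing the entity's class file rather than by hand-written code like this.

```java
import java.util.function.Supplier;

// Hypothetical entity with one eagerly loaded and one lazily loaded property.
class LazyUser {
    private final String name;                   // read by the first select
    private final Supplier<String> resumeLoader; // stands in for "select resume ..."
    private String resume;
    private boolean resumeLoaded;

    LazyUser(String name, Supplier<String> resumeLoader) {
        this.name = name;
        this.resumeLoader = resumeLoader;
    }

    String getName() {
        return name; // never touches the loader
    }

    String getResume() {
        if (!resumeLoaded) {
            resume = resumeLoader.get(); // the second select runs only here
            resumeLoaded = true;
        }
        return resume;
    }
}
```

Reading `getName()` never invokes the loader; the first `getResume()` call invokes it once, and later calls reuse the stored value.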

