Differences and Performance Comparison between MyBatis and JPA

  • 2021-09-24 22:48:08
  • OfStack

Preface

Recently some friends told me that JPA is very easy to use and that you don't need to write SQL at all. My first thought was: can a programmer who doesn't write SQL still be called a programmer? Besides, the more a tool encapsulates, the less extensible and efficient it tends to be, and I don't like things that are too heavily encapsulated. I usually prefer to write SQL by hand, so I use MyBatis for business code.

Then I found that JPA's saveAll() is very slow for batch inserts and batch updates. As a result, importing data from Excel became so slow that work which could have been done synchronously had to be moved to an asynchronous task every time. Personally, I think this is bad practice. Asynchronous processing is meant for work that can run at another time without affecting the current business flow, such as scheduled tasks or asynchronous incremental updates. Our code ended up with a lot of asynchronous wrapping: the Excel import is asynchronous, and because JPA is slow, there is another asynchronous call inside it. The whole call chain becomes very long, and when something goes wrong it can take half a day to track down.

Installing JPA and MyBatis


<dependency>
	<groupId>org.mybatis.spring.boot</groupId>
	<artifactId>mybatis-spring-boot-starter</artifactId>
</dependency>
<dependency>
	<groupId>mysql</groupId>
	<artifactId>mysql-connector-java</artifactId>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>

These dependencies do not need version numbers as long as the project declares the Spring Boot parent POM as its parent.

Create the classes


@Data
public class TestMybatis {
    private Long id;
    /**
     *  Domain account number 
     */
    private String userId;
    /**
     *  Principal metric 
     */
    private String mainMetric;
    /**
     *  Sub-metric 
     */
    private String subMetric;
    /**
     *  Measurement entry 
     */
    private String metricItem;
}

@SuppressWarnings("serial")
@javax.persistence.Entity
@javax.persistence.Table(name = "test")
@lombok.Data
public class TestJpa {
    @javax.persistence.Id
    @javax.persistence.GeneratedValue(strategy = javax.persistence.GenerationType.IDENTITY)
    private Long id;
    /**
     *  Domain account number 
     */
    private String userId;
    /**
     *  Principal metric 
     */
    private String mainMetric;
    /**
     *  Sub-metric 
     */
    private String subMetric;
    /**
     *  Measurement entry 
     */
    private String metricItem;
}

/**
 * @author Kakki
 * @version 1.0
 * @create 2021-06-17 17:39
 *  JPA repository; it plays roughly the same role as a MyBatis Mapper
 */
@Repository
public interface TestRee extends JpaRepository<TestJpa, Long> {

}

This is the MyBatis mapper XML


<insert id="insertList">
    insert into test(user_id,main_metric, sub_metric, metric_item) values
    <foreach collection="param" item="item" separator=",">
            (#{item.userId}, #{item.mainMetric}, #{item.subMetric}, #{item.metricItem})
    </foreach>
</insert>
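
To make it concrete what the `<foreach>` above expands to, here is a minimal sketch in plain Java, without MyBatis, that builds the same kind of multi-row INSERT statement with one placeholder group per row. The class name `MultiRowInsertSketch` and the method `buildSql` are hypothetical and exist only for illustration; in the real mapper MyBatis generates the statement and binds the parameters itself.

```java
import java.util.StringJoiner;

// Hypothetical illustration only: MyBatis does this work for you via <foreach>.
final class MultiRowInsertSketch {

    /** Builds one INSERT statement with rowCount placeholder groups. */
    static String buildSql(int rowCount) {
        if (rowCount <= 0) {
            throw new IllegalArgumentException("rowCount must be positive");
        }
        // One "(?, ?, ?, ?)" group per row, joined the way separator="," joins them.
        StringJoiner rows = new StringJoiner(", ");
        for (int i = 0; i < rowCount; i++) {
            rows.add("(?, ?, ?, ?)");
        }
        return "insert into test(user_id, main_metric, sub_metric, metric_item) values " + rows;
    }

    public static void main(String[] args) {
        // Three rows still produce a single statement, so a single round trip.
        System.out.println(buildSql(3));
    }
}
```

The point of the sketch is that the whole list travels to the database in one statement, no matter how many rows it contains.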

Now let's look at the speed


@Slf4j
@RunWith(SpringRunner.class)
@SpringBootTest(classes = {ColaDemoApplication.class})
class ColaDemoApplicationTests {
    @Autowired
    private TestRee testRee;
    @Autowired
    private MetricMapper metricMapper;

    @Test
    void contextLoads() {
        List<TestJpa> jpaList = new ArrayList<>(1000);
        List<com.kakki.colademo.gatewayimpl.database.dataobject.TestMybatis> mybatisList = new ArrayList<>(1000);
        for (int i = 0; i < 1000; i++) {
            TestJpa testJpa = new TestJpa();
            testJpa.setMainMetric(String.format("mainMetric%d", i));
            testJpa.setSubMetric(String.format("subMetric%d", i));
            testJpa.setUserId(String.format("userId%d", i));
            testJpa.setMetricItem(String.format("metricItem%d", i));
            jpaList.add(testJpa);
            com.kakki.colademo.gatewayimpl.database.dataobject.TestMybatis testMybatis = new com.kakki.colademo.gatewayimpl.database.dataobject.TestMybatis();
            testMybatis.setMainMetric(String.format("mainMetric%d", i));
            testMybatis.setSubMetric(String.format("subMetric%d", i));
            testMybatis.setUserId(String.format("userId%d", i));
            testMybatis.setMetricItem(String.format("metricItem%d", i));
            mybatisList.add(testMybatis);
        }
        StopWatch jpa = new StopWatch();
        jpa.start();
        testRee.saveAll(jpaList);
        jpa.stop();
        log.info("[jpa]{}ms", jpa.getTotalTimeMillis());

        StopWatch m = new StopWatch();
        m.start();
        metricMapper.insertList(mybatisList);
        m.stop();
        log.info("[m]{}ms", m.getTotalTimeMillis());

    }

}

22:35:10.708 [main] INFO c.e.c.ColaDemoApplicationTests - [jpa]10576ms
22:35:31.366 [main] INFO c.e.c.ColaDemoApplicationTests - [m]138ms

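The difference here is far more than 10 times: 10576 ms versus 138 ms is roughly 77 times. This first run probably also includes some one-off warm-up cost on the JPA side, since it is slower than the 10,000-row run below. And this is only 1,000 rows. Let's try 10,000: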

22:36:48.505 [main] INFO c.e.c.ColaDemoApplicationTests - [jpa]8081ms
22:37:05.005 [main] INFO c.e.c.ColaDemoApplicationTests - [m]613ms

Now try 100,000:

22:38:49.085 [main] INFO c.e.c.ColaDemoApplicationTests - [jpa]65710ms
22:39:09.844 [main] INFO c.e.c.ColaDemoApplicationTests - [m]9448ms
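
As a quick check on the numbers logged above, the following sketch computes how many times slower JPA was than MyBatis in each run. The class `SpeedupSketch` and the method `ratio` are hypothetical names used only for this arithmetic; the timings are copied from the logs.

```java
// Hypothetical helper for the arithmetic only; timings are taken from the logs above.
final class SpeedupSketch {

    /** Returns how many times larger slowerMillis is than fasterMillis. */
    static double ratio(long slowerMillis, long fasterMillis) {
        if (fasterMillis <= 0) {
            throw new IllegalArgumentException("fasterMillis must be positive");
        }
        return (double) slowerMillis / fasterMillis;
    }

    public static void main(String[] args) {
        // Each entry: {row count, JPA millis, MyBatis millis}
        long[][] runs = {
            {1_000, 10576, 138},
            {10_000, 8081, 613},
            {100_000, 65710, 9448},
        };
        for (long[] run : runs) {
            System.out.printf("%d rows: JPA took %.1f times as long as MyBatis%n",
                    run[0], ratio(run[1], run[2]));
        }
    }
}
```

For the three runs this works out to roughly 77, 13 and 7 times, so the gap is large in every case but narrows as the row count grows.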

The gap is obvious in every run. Why is it so big? Let's look at the saveAll() source code


	@Transactional
	@Override
	public <S extends T> List<S> saveAll(Iterable<S> entities) {

		Assert.notNull(entities, "Entities must not be null!");

		List<S> result = new ArrayList<S>();

		for (S entity : entities) {
			result.add(save(entity));
		}

		return result;
	}

	@Transactional
	@Override
	public <S extends T> S save(S entity) {

		if (entityInformation.isNew(entity)) {
			em.persist(entity);
			return entity;
		} else {
			return em.merge(entity);
		}
	}

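As the code above shows, saveAll() simply loops over the entities and calls save() once for each of them, and save() in turn checks whether the entity is new (for example, whether its primary key is empty) before calling persist or merge. In other words, n entities mean n loop iterations, n checks and n separate persist or merge calls instead of one multi-row statement, so performance degrades badly as the list grows.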

Conclusion

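I saw online that adding the following parameters is supposed to turn the inserts into batches, but when I tried them they had no effect at all. One likely reason is that TestJpa uses GenerationType.IDENTITY, and Hibernate disables JDBC insert batching for identity-generated keys. To really solve the problem you would probably need to write your own saveAll() method that inserts or updates in chunks, which should perform much better.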


spring.jpa.properties.hibernate.jdbc.batch_size=10000
spring.jpa.properties.hibernate.jdbc.batch_versioned_data=true
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
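
The idea of inserting "in pieces" mentioned above can be sketched as a small helper that splits a large list into fixed-size chunks. This is only a sketch under assumptions: `ChunkSketch` and `partition` are hypothetical names, and what you do with each chunk (for example, passing it to a multi-row insert such as the `insertList` mapper method shown earlier) is up to your own code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a list into chunks so each chunk can be written in one statement.
final class ChunkSketch {

    /** Splits items into consecutive chunks of at most chunkSize elements. */
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            // Copy the sub-list so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        // In real code each chunk would go to one batch insert call.
        for (List<Integer> chunk : partition(rows, 3)) {
            System.out.println("would insert " + chunk.size() + " rows: " + chunk);
        }
    }
}
```

Chunking keeps each statement to a bounded size, which matters because a single INSERT with a very large number of rows can run into the database's packet or statement size limits.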

Of course, today I only compared the performance of JPA with that of MyBatis, and as a programmer I know that technology serves the business. JPA has its advantages too. For example, if you declare a method findAllByIdIn(List ids) on the repository, you directly get the list of records matching that condition, and the same goes for findAllByOrderIdAndOrderType(String orderId, String orderType). This is very convenient: you don't need to write any SQL, and the query is generated for you automatically.

Summary

For a small project, development with JPA is definitely faster than with MyBatis. However, as business requirements iterate faster and faster, there are many things JPA clearly cannot handle well, and its SQL is harder to maintain than MyBatis's. Therefore I prefer MyBatis, where the SQL is simpler and easier to maintain.

