Efficiency comparison of several methods of MyBatis batch insertion
- 2021-11-14 05:32:04
- OfStack
1.1 Table Structure 1.2 Project Configuration File
1.3 Entity Class
2. Repeatedly execute a single insert statement 2.1 corresponding to mapper interface
2.2 Test Methods 3. foreach Splice SQL
3.1 Corresponding to mapper interface
3.2 Test Methodology 3.3 Execution Times4. Batch 4.1 rewriteBatchedStatements Parameter4.2 Batch Prepared4.3 Execution Times4.4 If the data is larger
STEP 5 Summarize
Preface
There are three main ways to batch data:
Execute a single insert statement repeatedly foreach Splice sql Batch processing
1. Prepare
Based on Spring Boot + Mysql, and in order to omit get/set, lombok is used, see pom. xml for details.
1.1 Table structure
id uses database self-increment.
DROP TABLE IF EXISTS `user_info_batch`;
CREATE TABLE `user_info_batch` (
`id` bigint(11) unsigned NOT NULL AUTO_INCREMENT COMMENT ' Primary key id',
`user_name` varchar(100) NOT NULL COMMENT ' Account name ',
`pass_word` varchar(100) NOT NULL COMMENT ' Login password ',
`nick_name` varchar(30) NOT NULL COMMENT ' Nickname ',
`mobile` varchar(30) NOT NULL COMMENT ' Mobile phone number ',
`email` varchar(100) DEFAULT NULL COMMENT ' Email address ',
`gmt_create` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT ' Creation time ',
`gmt_update` timestamp NULL DEFAULT NULL COMMENT ' Update time ',
PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC COMMENT 'Mybatis Batch';
1.2 Project Profile
Careful you may have found that the database url is followed by a paragraph rewriteBatchedStatements=true. What's the use? Don't worry, it will be introduced later.
# Database configuration
spring:
datasource:
url: jdbc:mysql://47.111.118.152:3306/mybatis?rewriteBatchedStatements=true
username: mybatis
password: password
driver-class-name: com.mysql.cj.jdbc.Driver
# mybatis
mybatis:
mapper-locations: classpath:mapper/*.xml
type-aliases-package: cn.van.mybatis.batch.entity
1.3 Entity Class
@Data
@Accessors(chain = true)
public class UserInfoBatchDO implements Serializable {
private Long id;
private String userName;
private String passWord;
private String nickName;
private String mobile;
private String email;
private LocalDateTime gmtCreate;
private LocalDateTime gmtUpdate;
}
1.4 UserInfoBatchMapper
public interface UserInfoBatchMapper {
/** Single insert
* @param info
* @return
*/
int insert(UserInfoBatchDO info);
/**
* foreach Insert
* @param list
* @return
*/
int batchInsert(List<UserInfoBatchDO> list);
}
1.5 UserInfoBatchMapper.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN" "http://mybatis.org/dtd/mybatis-3-mapper.dtd">
<mapper namespace="cn.van.mybatis.batch.mapper.UserInfoBatchMapper">
<insert id="insert" parameterType="cn.van.mybatis.batch.entity.UserInfoBatchDO">
insert into user_info_batch (user_name, pass_word, nick_name, mobile, email, gmt_create, gmt_update)
values (#{userName,jdbcType=VARCHAR}, #{passWord,jdbcType=VARCHAR},#{nickName,jdbcType=VARCHAR}, #{mobile,jdbcType=VARCHAR}, #{email,jdbcType=VARCHAR}, #{gmtCreate,jdbcType=TIMESTAMP}, #{gmtUpdate,jdbcType=TIMESTAMP})
</insert>
<insert id="batchInsert">
insert into user_info_batch (user_name, pass_word, nick_name, mobile, email, gmt_create, gmt_update)
values
<foreach collection="list" item="item" separator=",">
(#{item.userName,jdbcType=VARCHAR}, #{item.passWord,jdbcType=VARCHAR}, #{item.nickName,jdbcType=VARCHAR}, #{item.mobile,jdbcType=VARCHAR}, #{item.email,jdbcType=VARCHAR}, #{item.gmtCreate,jdbcType=TIMESTAMP}, #{item.gmtUpdate,jdbcType=TIMESTAMP})
</foreach>
</insert>
</mapper>
1.6 Preliminary data
For the convenience of testing, several variables are pulled away and loaded in advance.
private List<UserInfoBatchDO> list = new ArrayList<>();
private List<UserInfoBatchDO> lessList = new ArrayList<>();
private List<UserInfoBatchDO> lageList = new ArrayList<>();
private List<UserInfoBatchDO> warmList = new ArrayList<>();
// Counting tool
private StopWatch sw = new StopWatch();
In order to assemble data conveniently, a public method is extracted.
private List<UserInfoBatchDO> assemblyData(int count){
List<UserInfoBatchDO> list = new ArrayList<>();
UserInfoBatchDO userInfoDO;
for (int i = 0;i < count;i++){
userInfoDO = new UserInfoBatchDO()
.setUserName("Van")
.setNickName(" Dust blog ")
.setMobile("17098705205")
.setPassWord("password")
.setGmtUpdate(LocalDateTime.now());
list.add(userInfoDO);
}
return list;
}
Preheating data
@Before
public void assemblyData() {
list = assemblyData(200000);
lessList = assemblyData(2000);
lageList = assemblyData(1000000);
warmList = assemblyData(5);
}
2. Execute a single insert statement repeatedly
Maybe'lazy 'programmers will do this. It is very simple to nest an for loop directly on the original single insert statement.
2.1 Corresponding to mapper interface
int insert(UserInfoBatchDO info);
2.2 Test Methods
Because this method is too slow, the data is reduced to 2000
@Test
public void insert() {
log.info(" [Program warm-up] ");
for (UserInfoBatchDO userInfoBatchDO : warmList) {
userInfoBatchMapper.insert(userInfoBatchDO);
}
log.info(" [Warm-up is over] ");
sw.start(" Execute a single insert statement repeatedly ");
// Insert here 20w The bar was too slow, so I only inserted it 2000 Article
for (UserInfoBatchDO userInfoBatchDO : lessList) {
userInfoBatchMapper.insert(userInfoBatchDO);
}
sw.stop();
log.info("all cost info:{}",sw.prettyPrint());
}
2.3 Execution time
1st time
# Database configuration
spring:
datasource:
url: jdbc:mysql://47.111.118.152:3306/mybatis?rewriteBatchedStatements=true
username: mybatis
password: password
driver-class-name: com.mysql.cj.jdbc.Driver
# mybatis
mybatis:
mapper-locations: classpath:mapper/*.xml
type-aliases-package: cn.van.mybatis.batch.entity
0
Second time
-----------------------------------------
ms % Task name
-----------------------------------------
64853 100% Execute a single insert statement repeatedly
The 3rd time
-----------------------------------------
ms % Task name
-----------------------------------------
58235 100% Execute a single insert statement repeatedly
This method inserts 2000 pieces of data, and the average time of executing 3 times is 60991 ms.
3. foreach splicing SQL
3.1 Corresponding to mapper Interface
# Database configuration
spring:
datasource:
url: jdbc:mysql://47.111.118.152:3306/mybatis?rewriteBatchedStatements=true
username: mybatis
password: password
driver-class-name: com.mysql.cj.jdbc.Driver
# mybatis
mybatis:
mapper-locations: classpath:mapper/*.xml
type-aliases-package: cn.van.mybatis.batch.entity
3
3.2 Test Methods
This method and the next one are tested with 20w data.
# Database configuration
spring:
datasource:
url: jdbc:mysql://47.111.118.152:3306/mybatis?rewriteBatchedStatements=true
username: mybatis
password: password
driver-class-name: com.mysql.cj.jdbc.Driver
# mybatis
mybatis:
mapper-locations: classpath:mapper/*.xml
type-aliases-package: cn.van.mybatis.batch.entity
4
3.3 Execution Time
1st time
# Database configuration
spring:
datasource:
url: jdbc:mysql://47.111.118.152:3306/mybatis?rewriteBatchedStatements=true
username: mybatis
password: password
driver-class-name: com.mysql.cj.jdbc.Driver
# mybatis
mybatis:
mapper-locations: classpath:mapper/*.xml
type-aliases-package: cn.van.mybatis.batch.entity
5
Second time
# Database configuration
spring:
datasource:
url: jdbc:mysql://47.111.118.152:3306/mybatis?rewriteBatchedStatements=true
username: mybatis
password: password
driver-class-name: com.mysql.cj.jdbc.Driver
# mybatis
mybatis:
mapper-locations: classpath:mapper/*.xml
type-aliases-package: cn.van.mybatis.batch.entity
6
The 3rd time
-----------------------------------------
ms % Task name
-----------------------------------------
19827 100% foreach Splice sql
This method inserts 20w pieces of data, and the average time of executing 3 times is 18852 ms.
4. Batch processing
In this way, mapper and xml reuse 2.1.
4.1 rewriteBatchedStatements Parameters
At the beginning of Test 1, I found that the submission method changed to Mybatis Batch didn't work. In fact, when inserting, it was still inserting one record, and the speed was far lower than the original method of splicing SQL, which was very unscientific.
Later, it was discovered that for batch execution, a new parameter needs to be added to the concatenation URL string: rewriteBatchedStatements=true
Introduction of rewriteBatchedStatements ParametersrewriteBatchedStatements parameters should be added to JDBC of MySql and url connected, and drivers of version 5.1. 13 or above should be guaranteed to realize high-performance batch insertion. By default, MySql JDBC driver will ignore executeBatch () statement, break up a group of sql statements that we expect to execute in batches, and send them to MySql database one by one. Batch insertion is actually a single insertion, which directly causes low performance. Only when the rewriteBatchedStatements parameter is set to true, the driver will help you execute SQL in batches. This option works for both INSERT/UPDATE/DELETE.
4.2 Batch Preparation
Manual injection of SqlSessionFactory
@Resource
private SqlSessionFactory sqlSessionFactory;
Test code
# Database configuration
spring:
datasource:
url: jdbc:mysql://47.111.118.152:3306/mybatis?rewriteBatchedStatements=true
username: mybatis
password: password
driver-class-name: com.mysql.cj.jdbc.Driver
# mybatis
mybatis:
mapper-locations: classpath:mapper/*.xml
type-aliases-package: cn.van.mybatis.batch.entity
9
4.3 Execution Time
1st time
-----------------------------------------
ms % Task name
-----------------------------------------
09346 100% Batch execution Insert
Second time
-----------------------------------------
ms % Task name
-----------------------------------------
08890 100% Batch execution Insert
The 3rd time
-----------------------------------------
ms % Task name
-----------------------------------------
09042 100% Batch execution Insert
In this way, 20w pieces of data are inserted, and the average time of executing 3 times is 9092 ms.
4.4 If the data is larger
When I expanded the data to 100w, the way foreach spliced sql could no longer complete the insert, so I had to test the batch insert time.
When testing, you only need to cut list in the "4.2" test code into lageList test.
1st time
-----------------------------------------
ms % Task name
-----------------------------------------
32419 100% Batch execution Insert
Second time
-----------------------------------------
ms % Task name
-----------------------------------------
31935 100% Batch execution Insert
The 3rd time
-----------------------------------------
ms % Task name
-----------------------------------------
33048 100% Batch execution Insert
This method inserts 100w pieces of data, and the average time of executing 3 times is 32467 ms.
STEP 5 Summarize
批量插入方式 | 数据量 | 执行3次的平均时间 |
---|---|---|
循环插入单条数据 | 2000 | 60991 ms |
foreach 拼接sql | 20w | 18852 ms |
批处理 | 20w | 9092 ms |
批处理 | 100w | 32467 ms |
"This article sample code"