Five tips for parsing and optimizing the MySQL insert method

  • 2020-05-19 06:05:17
  • OfStack

The maximum execution time of PHP was revised to 600, but it was still timed out. After checking the number of data bars inserted before the timeout, it would take 40 to 60 minutes for the insertion to be completed. It seems that the efficiency of the program writing is too low, so it needs to be optimized.
Test computer configuration:
CPU: AMD Sempron(tm) Processor
Memory: 1.5 G
The statement is as follows:


$sql = "insert into `test` (`test`) values ('$content')";
for ($i=1;$i<1000;$i++) {
mysql_query($sql);
}
mysql_unbuffered_query run 3 The execution times are: 
9.85321879387
9.43223714828
9.46858215332
mysql_query The execution time is: 
10.0020229816
9.61053204536
9.24442720413
 So far, I think the most efficient way is as follows: 
$sql = "insert into `test` (`test`) values ('$content')";
for ($i=1;$i<999;$i++) {
$sql .= ",('$content')";
}
mysql_query($sql);
 The execution time is: 
0.0323481559753
0.0371758937836
0.0419669151306

The speed of the INSERT statement
The time required to insert a record consists of the following factors, with the Numbers representing the approximate proportion:
Connection: (3)
Send query to server :(2)
Analysis query :(2)
Insert record :(1x record size)
Insert index :(1x index)
Closed: (1)
Regardless of the initial overhead of opening the table, each query running concurrently opens.
The size of the table slows index insertion at the rate of logN (B tree).
Some ways to speed up insertion:
· if you are inserting many rows from the same client at the same time, insert several rows at the same time using INSERT statements with multiple VALUE. This is faster (several times faster in some cases) than using a single-line INSERT statement. If you are adding data to a non-empty table, you can adjust the bulk_insert_buffer_size variable to make data insertion faster. See section 5.3.3, "server system variables."
· if you insert many rows from different clients, you can use the INSERT DELAYED statement to speed things up. See section 13.2.4, "INSERT syntax".
· with MyISAM, if there are no deleted rows in the table, rows can be inserted while the SELECT statement is running.
· LOAD DATA INFILE is used when loading a table from a text file. This is usually 20 times faster than using many INSERT statements.
· when the table has many indexes, it is possible to do more work to make LOAD DATA INFILE faster. Use the following process:

Optionally create tables with CREATE TABLE.
Execute FLUSH TABLES statement or command mysqladmin flush-tables.
Use myisamchk - keys - used = 0 - rq/path/to/db/tbl_name. This will remove all indexes from the table.
Insert the data into the table with LOAD DATA INFILE, which is fast because no index is updated.
If you only want to read the table later, use myisampack to compress it.
Use myisamchk - r - q/path/to/db/tbl_name recreating index. This creates an index tree in memory before writing to disk, and it is faster because it avoids a lot of disk searching. The resulting index tree is also perfectly balanced.
Execute the FLUSH TABLES statement or the mysqladmin flush-tables command.

Note that if you insert an empty MyISAM table, LOAD DATA INFILE can also perform the previous optimization; The main difference is that you can have myisamchk allocate more temporary memory for index creation than you would for the server to re-index when you execute the LOAD DATA INFILE statement.
Can also use ALTER TABLE tbl_name DISABLE KEYS myisamchk instead - keys - used = 0 - rq/path/to/db/tbl_name, Using ALTER TABLE tbl_name ENABLE KEYS myisamchk instead - r - q/path/to/db/tbl_name. You can also skip FLUSH TABLES this way.
· locking tables can speed up INSERT operations performed with multiple statements:
LOCK TABLES a WRITE;
INSERT INTO a VALUES (1,23),(2,34),(4,33);
INSERT INTO a VALUES (8,26),(6,29);
UNLOCK TABLES;
This improves performance because the index cache is flushed to disk only once after all INSERT statements are completed. The number of INSERT statements is the number of index cache refreshes. If you can insert all rows in one statement, no locking is required.
For transaction tables, BEGIN and COMMIT should be used instead of LOCK TABLES to speed insertion.
Locking will also reduce the overall time for multi-connection testing, although the maximum wait time will rise because they are waiting for locking. Such as:
Connection 1 does 1000 inserts
Connections 2, 3, and 4 do 1 insert
Connection 5 does 1000 inserts
If locking is not used, 2, 3, and 4 will be completed before 1 and 5. If you use locking, 2, 3, and 4 will probably not finish before 1 or 5, but the overall time should be about 40% faster.
INSERT, UPDATE, and DELETE operations are fast in MySQL, and better overall performance can be achieved by locking operations that are inserted or updated continuously more than about five times in a row. If you insert multiple times in 1 row, LOCK TABLES can be executed, followed immediately by UNLOCK TABLES(approximately per 1000 rows) to allow other threads to access the table. This also results in good performance.
INSERT loads data much more slowly than LOAD DATA INFILE, even using the above strategy.
· to get faster speeds for LOAD DATA INFILE and INSERT in MyISAM table, expand the key cache by adding key_buffer_size system variables.
INSERT grammar

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
       [INTO] tbl_name [(col_name,...)]
       VALUES ({expr | DEFAULT},...),(...),...
       [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

or

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
       [INTO] tbl_name
       SET col_name={expr | DEFAULT}, ...
       [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

or

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
       [INTO] tbl_name [(col_name,...)]
       SELECT ...
       [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

1. Use of DELAYED
Use a deferred insert operation
The DELAYED adjuster applies to the INSERT and REPLACE statements. When the DELAYED insert arrives,
The server puts the data row into a queue and immediately returns a status message to the client, so that the client
The end can then proceed before the data table is actually inserted into the record. If the reader from the data
When data is read from a table, the data in the queue is held until there are no readers. Next server
Start inserting rows in the delayed row (delayed-row) queue. At the same time as the insert operation, the server
Also check if any new read requests are arriving and waiting. If so, the delayed row queue is suspended,
Allows the reader to continue. When there is no reader, the server again begins to insert delayed rows of data.
This process continues until the queue is empty.
A few points to note:
· INSERT DELAYED should be used only for INSERT statements that specify the list of values. Server ignored for INSERT DELAYED... DELAYED of SELECT statement.
· server ignored for INSERT DELAYED... DELAYED for ON DUPLICATE UPDATE statements.
· because the statement returns immediately before the row is inserted, you cannot use LAST_INSERT_ID() to get the AUTO_INCREMENT value. The AUTO_INCREMENT value may be generated by a statement.
· for SELECT statements, DELAYED lines are not visible until the lines are actually inserted.
· DELAYED is ignored in slave replication servers because DELAYED does not generate data in slave servers that is different from that of the primary server.
Note that currently the rows in the queue are only kept in storage until they are inserted into the table. This means that if you forcibly discontinue mysqld (for example, using kill-9)
Or if mysqld stops unexpectedly, all rows that were not written to disk will be lost.

2. Use of IGNORE
IGNORE is an extension of MySQL to the standard SQL. If there are duplicate keywords in the new table,
Or if a warning appears after STRICT mode is started, then IGNORE is used to control the operation of ALTER TABLE.
If IGNORE is not specified, when a repeat keyword error occurs, the copy operation is aborted and the previous step is returned.
If IGNORE is specified, only line 1 will be used for lines with duplicate keywords, and other conflicting lines will be deleted.
Also, fix the error value to get it as close to the correct value as possible.
insert ignore into tb(...) value(...)
So you don't have to check if it exists, you ignore it, you don't add it

3. Use of ON DUPLICATE KEY UPDATE
If you specify ON DUPLICATE KEY UPDATE and inserting a row results in a duplicate value in an UNIQUE index or PRIMARY KEY, the old row UPDATE is executed. For example, if the column a is defined as UNIQUE and contains a value of 1, the following two statements have the same effect:
mysql > INSERT INTO table (a,b,c) VALUES (1,2,3)
- > ON DUPLICATE KEY UPDATE c=c+1;

mysql > UPDATE table SET c=c+1 WHERE a=1;

If the row is inserted as a new record, the value of the affected row is 1; If the original record is updated, the value of the affected row is 2.
Note: if column b is also the only column, INSERT is equivalent to this UPDATE statement:
mysql > UPDATE table SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;

If a=1 OR b=2 matches multiple row directions, only 1 row is updated. In general, you should try to avoid using the ON DUPLICATE KEY clause for tables with multiple 1-only keywords.

You can use the VALUES(col_name) function in the UPDATE clause from INSERT... The INSERT part of the UPDATE statement refers to the column value. In other words, VALUES(col_name) in the UPDATE clause can reference the value of the inserted col_name if there is no duplicate keyword conflict. This function is especially useful for multi-row inserts. VALUES() function only in INSERT... The UPDATE statement makes sense, and other times it returns NULL.
Example:

mysql> INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
       -> ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
 This statement ACTS the same as the following two statements: 
mysql> INSERT INTO table (a,b,c) VALUES (1,2,3)
       -> ON DUPLICATE KEY UPDATE c=3;
mysql> INSERT INTO table (a,b,c) VALUES (4,5,6)
       -> ON DUPLICATE KEY UPDATE c=9;

When you use ON DUPLICATE KEY UPDATE, the DELAYED option is ignored.


Related articles: