Jan 6, 2021

A comprehensive guide to handling duplicate data in MySQL

Some MySQL tables may have duplicate records. In some cases we allow duplicate data, but sometimes we need to delete duplicate data.

In this section, we will show you how to prevent duplicate data from appearing in a table and how to delete duplicate data from a table.

1. Prevent duplicate data in the table

You can ensure data uniqueness by defining a PRIMARY KEY (primary key) or a UNIQUE (unique) index on the relevant fields of the MySQL table.

Let’s try an example: The table below has no index and no primary key, so it allows multiple duplicate records.

CREATE TABLE person_tbl
(
 first_name CHAR(20),
 last_name CHAR(20),
 sex CHAR(10)
);

If you want the combination of first_name and last_name to be unique, you can define a composite primary key on the two fields. Fields that belong to a primary key cannot hold NULL, so declare them NOT NULL, as shown below:

CREATE TABLE person_tbl
(
 first_name CHAR(20) NOT NULL,
 last_name CHAR(20) NOT NULL,
 sex CHAR(10),
 PRIMARY KEY (last_name, first_name)
);

Once a primary key or unique index is in place, an ordinary INSERT INTO statement that tries to insert duplicate data does not execute successfully and throws an error.

The difference between INSERT IGNORE INTO and INSERT INTO is that INSERT IGNORE INTO skips rows that would duplicate an existing primary key or unique value: if no matching row exists, the new row is inserted, and if one exists, the new row is discarded. The data already in the table is preserved, and only the missing rows are filled in.

The following example uses INSERT IGNORE INTO. The second statement neither raises an error nor inserts duplicate data into the table:
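As a minimal sketch of that failure, assuming person_tbl was created with the composite primary key shown above. The sample values ('Smith', 'Anna') are made up for illustration and are not part of the original example:

```sql
-- Assumes person_tbl has PRIMARY KEY (last_name, first_name).
-- ('Smith', 'Anna') is a hypothetical sample row.
INSERT INTO person_tbl (last_name, first_name)
VALUES ('Smith', 'Anna');   -- first insert succeeds

-- Same key values again: MySQL rejects the statement with
-- error 1062 (duplicate entry) and the table is left unchanged.
INSERT INTO person_tbl (last_name, first_name)
VALUES ('Smith', 'Anna');
```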

mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
 -> VALUES( 'Jay', 'Thomas');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT IGNORE INTO person_tbl (last_name, first_name)
 -> VALUES( 'Jay', 'Thomas');
Query OK, 0 rows affected (0.00 sec)

Once uniqueness is enforced on the table, INSERT IGNORE INTO does not return an error when duplicate data is inserted; it only raises a warning. REPLACE INTO behaves differently: if a row with the same primary key or unique value already exists, that row is deleted first and the new row is then inserted.

Another way to enforce uniqueness is to add a UNIQUE index, as shown below:
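The contrast between the two statements can be sketched as follows. This assumes person_tbl has the composite primary key from earlier and already contains the ('Jay', 'Thomas') row; the value 'M' for sex is an illustrative assumption:

```sql
-- Assumes PRIMARY KEY (last_name, first_name) and an existing
-- ('Jay', 'Thomas') row whose sex column is NULL.
INSERT IGNORE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');   -- skipped: the existing row is left untouched
SHOW WARNINGS;                   -- displays the duplicate-entry warning

REPLACE INTO person_tbl (last_name, first_name, sex)
VALUES ('Jay', 'Thomas', 'M');   -- old row is deleted, new row is inserted
```

After the REPLACE INTO, the row's sex column holds 'M', whereas after the INSERT IGNORE INTO alone it would still be NULL. Because REPLACE INTO deletes and re-inserts, any field not listed in the statement falls back to its default value.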

CREATE TABLE person_tbl
(
 first_name CHAR(20) NOT NULL,
 last_name CHAR(20) NOT NULL,
 sex CHAR(10),
 UNIQUE (last_name, first_name)
);
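If the table already exists, a unique index can also be added afterwards. The sketch below assumes the table currently holds no duplicate (last_name, first_name) pairs; the index name uq_person_name is hypothetical:

```sql
-- uq_person_name is a made-up index name for illustration.
-- The statement fails if duplicate (last_name, first_name) pairs already exist.
ALTER TABLE person_tbl
  ADD UNIQUE INDEX uq_person_name (last_name, first_name);

-- List the indexes on the table to confirm the new one is present.
SHOW INDEX FROM person_tbl;
```

Unlike a primary key, a UNIQUE index on nullable fields still permits multiple rows containing NULL, which is one reason the fields are declared NOT NULL in the definition above.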

2. Count duplicate data

The following query counts the records in which the first_name and last_name combination is repeated:

mysql> SELECT COUNT(*) as repetitions, last_name, first_name
 -> FROM person_tbl
 -> GROUP BY last_name, first_name
 -> HAVING repetitions > 1;

The above query returns each first_name and last_name combination that occurs more than once in the person_tbl table, together with the number of times it occurs. In general, to query duplicate values, perform the following steps:

1. Determine which columns contain values that are likely to be duplicated.
2. List those columns in the column selection list, along with COUNT(*).
3. List the same columns in the GROUP BY clause.
4. Add a HAVING clause that keeps only the groups whose count is greater than 1.
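The steps above can be tried end to end on a small scratch table. The table name person_demo and its sample rows are assumptions made for this sketch, chosen so that exactly one name pair is duplicated:

```sql
-- Hypothetical scratch table with no unique key, so duplicates are allowed.
CREATE TABLE person_demo
(
  first_name CHAR(20),
  last_name  CHAR(20),
  sex        CHAR(10)
);

INSERT INTO person_demo (last_name, first_name, sex) VALUES
  ('Jay',   'Thomas', 'M'),
  ('Jay',   'Thomas', 'M'),
  ('Smith', 'Anna',   'F');

-- Steps 2 to 4: select the candidate columns with COUNT(*),
-- group by them, and keep only groups that occur more than once.
SELECT COUNT(*) AS repetitions, last_name, first_name
FROM person_demo
GROUP BY last_name, first_name
HAVING repetitions > 1;
-- Only the ('Jay', 'Thomas') group qualifies, with repetitions = 2.
```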

3. Filtering duplicate data

If you need to read only unique rows, use the DISTINCT keyword in the SELECT statement to filter out duplicated data.

mysql> SELECT DISTINCT last_name, first_name
 -> FROM person_tbl
 -> ORDER BY last_name;

You can also use GROUP BY to read unique rows from a data table:

mysql> SELECT last_name, first_name
 -> FROM person_tbl
 -> GROUP BY last_name, first_name;

4. Delete duplicate data

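If you want to delete duplicate data from a table, you can rebuild the table from a deduplicated copy with the following SQL statements. Because sex is not part of the GROUP BY clause, it is wrapped in MIN() so that the query also runs when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5):

mysql> CREATE TABLE tmp SELECT last_name, first_name, MIN(sex) AS sex
 ->     FROM person_tbl
 ->     GROUP BY last_name, first_name;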
mysql> DROP TABLE person_tbl;
mysql> ALTER TABLE tmp RENAME TO person_tbl;

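On older servers you can also delete duplicate entries in a table simply by adding a UNIQUE index or a PRIMARY KEY (primary key) with ALTER IGNORE TABLE. Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so this method only works on earlier versions. The method is as follows: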

mysql> ALTER IGNORE TABLE person_tbl
 -> ADD PRIMARY KEY (last_name, first_name);
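On servers where ALTER IGNORE TABLE is not available, a similar result can be sketched by copying the rows into a new table that already has the key and letting INSERT IGNORE discard the duplicates. The table names person_dedup and person_tbl_old are hypothetical, and the sketch assumes first_name and last_name contain no NULL values:

```sql
-- person_dedup and person_tbl_old are made-up names for illustration.
-- Assumes first_name and last_name hold no NULLs in person_tbl.
CREATE TABLE person_dedup
(
  first_name CHAR(20) NOT NULL,
  last_name  CHAR(20) NOT NULL,
  sex        CHAR(10),
  PRIMARY KEY (last_name, first_name)
);

-- For each (last_name, first_name) pair the first row copied is kept;
-- later rows with the same pair are skipped with a warning.
INSERT IGNORE INTO person_dedup (first_name, last_name, sex)
SELECT first_name, last_name, sex
FROM person_tbl;

-- Swap the tables in one statement; the original is kept as a backup.
RENAME TABLE person_tbl TO person_tbl_old,
             person_dedup TO person_tbl;
```

Keeping person_tbl_old until the result has been checked is a deliberate choice here; it can be dropped once the deduplicated table looks correct.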

That covers handling duplicate data in MySQL. I hope it helps with your learning.