Encyclopedia of Methods for Querying and Deleting Duplicate Records in MySQL

  • 2021-08-12 03:51:31
  • OfStack

Preface

This article mainly introduces the methods of querying and deleting duplicate records in MySQL, and shares them for your reference and study. Let's take a look at the detailed introduction below:

Find records for all duplicate headings:


select title,count(*) as count from user_table group by title having count>1; 

SELECT * FROM t_info a WHERE ((SELECT COUNT(*) FROM t_info WHERE Title = a.Title) > 1) ORDER BY Title DESC

1. Find duplicate records

1. Find all duplicate records


SELECT * FROM t_info a WHERE ((SELECT COUNT(*) FROM t_info WHERE Title = a.Title) > 1) ORDER BY Title DESC

2. Filter duplicate records (only one is displayed)


Select * From HZT Where ID In (Select Max(ID) From HZT Group By Title)

Note: Maximum 1 record of ID is shown here

2. Delete duplicate records

1. Delete all duplicate records (with caution)


Delete  Table  Where  Duplicate field  In (Select  Duplicate field  From  Table  Group By  Duplicate field  Having Count(*)>1)

2. Keep one (this should be what most people need _)


Delete HZT Where ID Not In (Select Max(ID) From HZT Group By Title)

Note: Maximum 1 record of ID is kept here

3. Examples

1. Lookup the redundant duplicate records in the table. The duplicate records are judged according to a single field (peopleId)


select * from people where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)

2. Delete redundant duplicate records in the table. The duplicate records are judged according to a single field (peopleId), and only the records with the smallest rowid are left


delete from people where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1) and rowid not in (select min(rowid) from people group by peopleId having count(peopleId )>1)

3. Redundant duplicate records (multiple fields) in the lookup table


select * from vitae a where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1)

4. Delete redundant duplicate records (multiple fields) in the table, leaving only the records with the smallest rowid


delete from vitae a where (a.peopleId,a.seq) in (select peopleId,seq from vitae group by peopleId,seq having count(*) > 1) and rowid not in (select min(rowid) from vitae group by peopleId,seq having count(*)>1)

5. Redundant duplicate records (multiple fields) in the lookup table, excluding the records with the smallest rowid


SELECT * FROM t_info a WHERE ((SELECT COUNT(*) FROM t_info WHERE Title = a.Title) > 1) ORDER BY Title DESC
0

4. Additional

There are more than two duplicate records, 1 is a completely duplicate record, that is, a record with all fields duplicate, and 2 is a record with some key fields duplicate, such as Name field duplicate, while other fields are not duplicate or duplicate can be ignored.

1. For the first repetition, it is easier to solve and use


SELECT * FROM t_info a WHERE ((SELECT COUNT(*) FROM t_info WHERE Title = a.Title) > 1) ORDER BY Title DESC
1

You can get a result set without duplicate records.

If this table needs to delete duplicate records (keep 1 duplicate record), you can delete them as follows


SELECT * FROM t_info a WHERE ((SELECT COUNT(*) FROM t_info WHERE Title = a.Title) > 1) ORDER BY Title DESC
2

The reason for this duplication is due to poor table design, which can be solved by adding only 1 index column.

2. This kind of duplicate problem usually requires keeping the first record in duplicate records, and the operation method is as follows

Assuming that the duplicate fields are Name and Address, the result set of only 1 for these two fields is required


select identity(int,1,1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name,autoID
select * from #Tmp where autoID in(select autoID from #tmp2)

Summarize


Related articles: