In mysql: single table distinct multi table group by queries remove duplicate records

  • 2020-05-27 07:22:19
  • OfStack

One-only queries for a single table use: distinct
One-only queries for multiple tables use: group by
When distinct queries multiple tables, left join is still valid, all connections are invalid,
When using mysql, it is sometimes necessary to query for records that are not duplicated in a field. Although mysql provides the keyword distinct to filter out redundant duplicate records and keep only one record, it is often used only to return the number of non-duplicate records, rather than all the values of non-duplicate records. The reason is that distinct can only return its target field, but cannot return other fields. If it cannot be solved by distinct, I can only solve it by double cyclic query. For a station with a large amount of data, it will undoubtedly directly affect the efficiency.
Here's an example:
The structure of the table is as follows:
id name
1 a
2 b
3 c
4 c
5 b
The basic table structure looks like this, this is just a simple example, the actual multi-table query and so on situation will be much more complex.
For example, if I want to query with one statement to get all the data that name does not repeat, then I must use distinct to get rid of the redundant duplicate records.
select distinct name from table
The result is:
name
a

c
It seems to work, but what I want to get is the id value? Change 1 to the following query statement:
select distinct name, id from table
The result would be:
id name
1 a
2 b
3 c
4 c
5 b
Why didn't distinct work? It actually works, but it also works on two fields, so you have to have id and name are the same to be excluded.
Let's change the query statement:
select id, distinct name from table
Unfortunately, you get nothing but error messages, distinct must be at the beginning. Is it too hard to put distinct in the where condition? Try and report the same error.

Finally, I found a usage in mysql manual. I used group_concat(distinct name) and group by name to achieve the function I needed. I was excited.
Report an error, depressed!
Even the mysql manual took it against me, giving me hope and pushing me into despair.
On closer inspection, group_concat function is 4.1 support, halo, I 4.0. No way, upgrade, upgrade level 1 try, success.
Finally done, but so 1 to, and must ask the customer to upgrade.
Now that you can use the group_concat function, what about the other functions?
Try count function 1, and it will work. It took so much time.
Now release the full statement:
select *, count(distinct name) from table group by name
Results:
id name count(distinct name)
1 a 1
2 b 1
3 c 1
The last one is redundant, just ignore it, and you get what you want.
It turns out that mysql is so stupid that he was cheated with one stroke. I hope you won't be troubled by this problem.
By the way, group by must be placed before order by and limit, otherwise an error will be reported.
Say 1 for a practical example of group by:

$sql = 'select DISTINCT n.nid,tn.tid,n.title,n.created,ni.thumbpath from {term_node} tn INNER JOIN {node} n ON n.nid=tn.nid INNER JOIN {node_images} ni ON ni.nid=n.nid where tn.tid IN('.implode(',', $tids).') ORDER BY n.nid DESC';
$res = db_query($sql);
$t_data = array();
while($r = db_fetch_array($res)) {
print_r($r);
}

When you use this query, there are always two cases of the same nid, such as the following result

Array
(
[created] => 1215331278
[nid] => 1603
[tid] => 32
[title] =>  Summer wedding green drink DIY
[thumbpath] => files/node_images/home-77.1_tn.jpg
)
Array
(
[created] => 1215331278
[nid] => 1603
[tid] => 32
[title] =>  Summer wedding green drink DIY
[thumbpath] => files/node_images/003_primary_tn.jpg
)

DISTINCT doesn't work, it actually works, but I think nid is only 1 in the query structure.
Finally, group by was used

$sql = 'select
n.nid,tn.tid,n.title,n.created,ni.thumbpath from {term_node} tn INNER
JOIN {node} n ON n.nid=tn.nid INNER JOIN {node_images} ni ON
ni.nid=n.nid where tn.tid IN('.implode(',', $tids).') GROUP BY
n.nid DESC';
$res = db_query($sql);
$t_data = array();
while($r = db_fetch_array($res)) {
print_r($r);
}

I get nid is the only one.

Related articles: