SQL interview question: find the sum of time difference of has repetition
- 2021-12-19 07:04:17
- OfStack
When interviewing for BI position in a certain company, one sql question in the interview question is very simple at first glance. When writing it, I found that I lacked a summary and didn't write it quickly.
The topics are as follows:
Seek the promotion days of each brand
Table sale is a promotion and marketing table, and there are duplicate dates in the data, for example, end_date with id 1 is 20180905, start_date with id 2 is 20180903, that is, there are duplicate sales dates with id 1 and id 2, and the promotion days of each brand are calculated (duplicate is not counted)
Table results are as follows:
+------+-------+------------+------------+
| id | brand | start_date | end_date |
+------+-------+------------+------------+
| 1 | nike | 2018-09-01 | 2018-09-05 |
| 2 | nike | 2018-09-03 | 2018-09-06 |
| 3 | nike | 2018-09-09 | 2018-09-15 |
| 4 | oppo | 2018-08-04 | 2018-08-05 |
| 5 | oppo | 2018-08-04 | 2018-08-15 |
| 6 | vivo | 2018-08-15 | 2018-08-21 |
| 7 | vivo | 2018-09-02 | 2018-09-12 |
+------+-------+------------+------------+
The final result should be
brand | all_days |
---|---|
nike | 13 |
oppo | 12 |
vivo | 18 |
Table building statement
-- ----------------------------
-- Table structure for sale
-- ----------------------------
DROP TABLE IF EXISTS `sale`;
CREATE TABLE `sale` (
`id` int(11) DEFAULT NULL,
`brand` varchar(255) DEFAULT NULL,
`start_date` date DEFAULT NULL,
`end_date` date DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- ----------------------------
-- Records of sale
-- ----------------------------
INSERT INTO `sale` VALUES (1, 'nike', '2018-09-01', '2018-09-05');
INSERT INTO `sale` VALUES (2, 'nike', '2018-09-03', '2018-09-06');
INSERT INTO `sale` VALUES (3, 'nike', '2018-09-09', '2018-09-15');
INSERT INTO `sale` VALUES (4, 'oppo', '2018-08-04', '2018-08-05');
INSERT INTO `sale` VALUES (5, 'oppo', '2018-08-04', '2018-08-15');
INSERT INTO `sale` VALUES (6, 'vivo', '2018-08-15', '2018-08-21');
INSERT INTO `sale` VALUES (7, 'vivo', '2018-09-02', '2018-09-12');
Mode 1:
The method of using one record under self-association
select brand,sum(end_date-befor_date+1) all_days from
(
select s.id ,
s.brand ,
s.start_date ,
s.end_date ,
if(s.start_date>=ifnull(t.end_date,s.start_date) ,s.start_date,DATE_ADD(t.end_date,interval 1 day) ) as befor_date
from sale s left join (select id+1 as id ,brand,end_date from sale) t on s.id = t.id and s.brand = t.brand
order by s.id
)tmp
group by brand
Running result
+-------+---------+
| brand | all_day |
+-------+---------+
| nike | 13 |
| oppo | 12 |
| vivo | 18 |
+-------+---------+
This method is valid for the table in this topic, but it is not applicable to the records of brands with discontinuous id.
Mode 2:
SELECT a.brand,SUM(
CASE
WHEN a.start_date=b.start_date AND a.end_date=b.end_date
AND NOT EXISTS(
SELECT *
FROM sale c LEFT JOIN sale d ON c.brand=d.brand
WHERE d.brand=a.brand
AND c.start_date=a.start_date
AND c.id<>d.id
AND (d.start_date BETWEEN c.start_date AND c.end_date AND d.end_date>c.end_date
OR
c.start_date BETWEEN d.start_date AND d.end_date AND c.end_date>d.end_date)
)
THEN (a.end_date-a.start_date+1)
WHEN (a.id<>b.id AND b.start_date BETWEEN a.start_date AND a.end_date AND b.end_date>a.end_date ) THEN (b.end_date-a.start_date+1)
ELSE 0 END
) AS all_days
FROM sale a JOIN sale b ON a.brand=b.brand GROUP BY a.brand
Running result
+-------+----------+
| brand | all_days |
+-------+----------+
| nike | 13 |
| oppo | 12 |
| vivo | 18 |
+-------+----------+
In which condition
d.start_date BETWEEN c.start_date AND c.end_date AND d.end_date>c.end_date
OR
c.start_date BETWEEN d.start_date AND d.end_date AND c.end_date>d.end_date
It can be replaced by
c.start_date < d.end_date AND (c.end_date > d.start_date)
The result is also correct
It is also feasible to use the analysis function. I didn't install oracle on my computer for the time being, but I wrote it with mysql.