Write a SQL query to delete all duplicate email entries in a table named Person, keeping only unique emails based on its smallest Id.
+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
| 3  | john@example.com |
+----+------------------+
Id is the primary key column for this table.
For example, after running your query, the above Person table should have the following rows:
+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
+----+------------------+
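This problem asks us to delete duplicate emails. We can first find the rows to keep, one per distinct email, then take the complement of that set to get the duplicates and delete them all. How do we find the rows to keep? Group the rows by Email and use MIN to pick the smallest Id in each group, then delete every row whose Id is not in that set. In MySQL, the inner result is wrapped in a derived table (aliased p here) because a subquery inside a DELETE cannot select directly from the table being modified: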
Solution 1:
DELETE FROM Person WHERE Id NOT IN (SELECT Id FROM (SELECT MIN(Id) Id FROM Person GROUP BY Email) p);
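As a quick sanity check, the statement above can be run against the sample table. The following is a minimal sketch that assumes SQLite (through Python's built-in sqlite3 module) rather than MySQL; the helper names make_person and dedupe_keep_min_id are made up for this illustration.

```python
import sqlite3

def make_person(rows):
    """Create an in-memory Person table holding the given (Id, Email) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Email TEXT)")
    conn.executemany("INSERT INTO Person (Id, Email) VALUES (?, ?)", rows)
    return conn

def dedupe_keep_min_id(conn):
    """Run Solution 1: delete every row whose Id is not the minimum for its Email."""
    conn.execute(
        "DELETE FROM Person WHERE Id NOT IN "
        "(SELECT Id FROM (SELECT MIN(Id) Id FROM Person GROUP BY Email) p)"
    )
    # Return the surviving rows in Id order.
    return conn.execute("SELECT Id, Email FROM Person ORDER BY Id").fetchall()

# Sample data from the problem statement.
conn = make_person([
    (1, "john@example.com"),
    (2, "bob@example.com"),
    (3, "john@example.com"),
])
print(dedupe_keep_min_id(conn))
# [(1, 'john@example.com'), (2, 'bob@example.com')]
```

SQLite would also accept the query without the derived table p; the wrapper is kept here so the statement matches the one above.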
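We can also use an inner join to relate two copies of the table on Email, then delete the rows that share an email with another row but have the larger Id. See the code below: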
Solution 2:
DELETE p2 FROM Person p1 JOIN Person p2 ON p2.Email = p1.Email WHERE p2.Id > p1.Id;
We can also skip the JOIN and relate the two copies of the table directly in the WHERE clause:
Solution 3:
DELETE p2 FROM Person p1, Person p2 WHERE p1.Email = p2.Email AND p2.Id > p1.Id;
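To see which rows the self-join in Solutions 2 and 3 selects, the same pairing condition can be run as a plain SELECT. The sketch below assumes SQLite via Python's sqlite3 module. SQLite does not support MySQL's multi-table `DELETE p2 FROM ...` syntax, so the delete step here feeds the join into an `IN` subquery instead; that is a substitute for illustration, not the statement above. The helper names are hypothetical.

```python
import sqlite3

def load_person(rows):
    """Create an in-memory Person table holding the given (Id, Email) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Email TEXT)")
    conn.executemany("INSERT INTO Person (Id, Email) VALUES (?, ?)", rows)
    return conn

# Same pairing condition as Solutions 2 and 3: p2 is a duplicate whenever
# some row p1 has the same Email and a smaller Id.
DUPLICATE_IDS = (
    "SELECT p2.Id FROM Person p1 JOIN Person p2 "
    "ON p2.Email = p1.Email WHERE p2.Id > p1.Id"
)

def duplicate_ids(conn):
    """Return the sorted, de-duplicated Ids that the self-join marks for deletion."""
    rows = conn.execute(DUPLICATE_IDS).fetchall()
    return sorted({row[0] for row in rows})

def delete_duplicates(conn):
    """Delete the marked rows (SQLite substitute for the multi-table DELETE)."""
    conn.execute("DELETE FROM Person WHERE Id IN (" + DUPLICATE_IDS + ")")
    return conn.execute("SELECT Id, Email FROM Person ORDER BY Id").fetchall()

# Sample data from the problem statement.
db = load_person([
    (1, "john@example.com"),
    (2, "bob@example.com"),
    (3, "john@example.com"),
])
print(duplicate_ids(db))      # [3]
print(delete_duplicates(db))  # [(1, 'john@example.com'), (2, 'bob@example.com')]
```

A row with several smaller-Id partners appears in the join more than once, which is why duplicate_ids collapses the result into a set before sorting.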
References:
https://leetcode.com/discuss/61176/simple-solution-using-a-self-join
https://leetcode.com/discuss/48403/my-answer-delete-duplicate-emails-with-double-nested-query