Deleting duplicate rows with SQL


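I scraped some data with a crawler, but some of the titles are duplicated and need to be removed, so I looked up the SQL for deleting rows with duplicate titles.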

# Query all rows whose title is duplicated
SELECT
	* 
FROM
	tb_xici_article 
WHERE
	post_title IN ( SELECT post_title FROM tb_xici_article GROUP BY post_title HAVING count( post_title ) > 1 )
	
	
# Query the duplicated rows, excluding the row with the smallest id in each group of duplicates
SELECT
	* 
FROM
	tb_xici_article 
WHERE
	post_title IN ( SELECT post_title FROM tb_xici_article GROUP BY post_title HAVING count( post_title ) > 1 ) 
	AND id NOT IN ( SELECT min( id ) FROM tb_xici_article GROUP BY post_title HAVING count( post_title ) > 1 )
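
Before deleting anything, it can help to see each duplicated title on a single line, together with how many copies exist and which id would be kept. The following is a minimal sketch against the same table; the column aliases `copies` and `keep_id` are names chosen here for illustration and do not appear in the original queries.

```sql
-- One row per duplicated title: how many copies there are,
-- and the smallest id (the row the later DELETE leaves in place).
SELECT
	post_title,
	COUNT(*)  AS copies,   -- illustrative alias
	MIN(id)   AS keep_id   -- illustrative alias
FROM
	tb_xici_article
GROUP BY
	post_title
HAVING
	COUNT(*) > 1;
```

For each title, `copies - 1` rows should be removed, so summing that value gives the row count the DELETE is expected to report.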
	

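# Delete the duplicated rows
If you write the DELETE directly as below, MySQL fails with the error "You can't specify target table for update in FROM clause" (error 1093), because the statement reads from the same table it is deleting from. The result of the SELECT has to be wrapped in an intermediate (derived) table and selected from once more.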
DELETE
FROM
	tb_xici_article 
WHERE
	post_title IN ( SELECT post_title FROM tb_xici_article GROUP BY post_title HAVING count( post_title ) > 1 ) 
	AND id NOT IN ( SELECT min( id ) FROM tb_xici_article GROUP BY post_title HAVING count( post_title ) > 1 )
	
# Final version
DELETE 
FROM
	tb_xici_article 
WHERE
	id IN (
	SELECT
		temp.id 
	FROM
		(
		SELECT
			* 
		FROM
			tb_xici_article 
		WHERE
			post_title IN ( SELECT post_title FROM tb_xici_article GROUP BY post_title HAVING count( post_title ) > 1 ) 
			AND id NOT IN ( SELECT min( id ) FROM tb_xici_article GROUP BY post_title HAVING count( post_title ) > 1 ) 
		) temp 
	)
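
As an alternative to the derived-table form above, MySQL's multi-table DELETE syntax can express the same "keep the smallest id per title" rule with a self-join. This is a sketch of a different technique, not the approach the post settled on; the aliases `a` and `b` are arbitrary, and it is worth running the equivalent SELECT first to confirm which rows match.

```sql
-- Delete every row for which another row with the same title
-- and a smaller id exists, so only the smallest id per title survives.
-- Aliases a (row to delete) and b (row to compare against) are illustrative.
DELETE a
FROM
	tb_xici_article AS a
	JOIN tb_xici_article AS b
		ON a.post_title = b.post_title
		AND a.id > b.id;
```

Because the table is joined to itself rather than referenced in a subquery, this form does not run into the "target table" restriction. To preview the affected rows, replace `DELETE a` with `SELECT DISTINCT a.*`.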
	

  

