1、首先查一下看看重复了多少,按url分组,并且count大于1的就说明是重复的。

select url,count(*) from yg_articles t group by t.url having count(*)>1

2、我这里查到是有1500条重复,都是重复了1次,那么我们就把id大的那个删掉就行,先把大id查出来

select max(id) from yg_articles t group by t.url having count(*)>1

3、然后把大id删掉

delete from yg_articles 
where id in (select max(id) from yg_articles t group by t.url having count(*)>1)

不出意外的报错了,[Err] 1093 – You can’t specify target table ‘yg_articles ‘ for update in FROM clause
不要慌,这个意思是说,你不能用yg_articles的子查询来删yg_articles,典型的我杀我自己,这样容易死锁,所以被禁用了

怎么办呢?两个方法,一种是搞个临时表b,把查到的结果先insert进去,然后delete from yg_articles where id in (select id from b)就搞定了。

还有一种更简单的,就是把子查询弄个别名,当成一个子表b来用,效果也是一样。
先select验证一下出来的条数对不对,我有一次不小心写错sql就删除了全表(还好有备份)

select * from yg_articles 
where id in (
   select id from (select max(id) id from yg_articles t group by t.url having count(*)>1) b
)

然后执行删除,真正执行前,一定要先备份,一定要先备份,一定要先备份,

delete from yg_articles 
where id in (
   select id from (select max(id) from yg_articles t group by t.url having count(*)>1) b
)

最后,如果你重复的不是2个,而是3个、4个怎么办呢?删最大的,删完了还是有重复的。那就多执行几次。最后他就只剩一个了。

或者你反着写一下sql,删除所有存在重复的id,并且not id 最小id的子查询,这样就可以一次搞定。

delete from yg_articles 
where url in (
   select url from (select url from yg_articles t group by t.url having count(*)>1) b
)
and id not in (
   select id from (select min(id) as id from yg_articles t group by t.url having count(*)>1) c
)