bnussey bnussey - 6 months ago 38
SQL Question

Rails: Delete duplicate records based on multiple columns

In our system we run hourly imports from an external database. Due to an error in the import scripts, there are now some duplicate records.

A duplicate is deemed where any record has the same


What code can I run to find and delete these duplicates?

I was playing around with this:,:company).group(:legacy_id,:company).having("count(*) > 1")

It seemed to return some of the duplicates, but I wasn't sure how to delete from there?

Any ideas?


You can try the following approach:

Product.group_by{|x| [x.legacy_id,]}
       .flat_map{|_,x| x.drop(1)}

Or pure ActiveRecord:

Product.where.not(id:, :company).pluck('min('))

Or pure sql:

delete from products
where id not in ( 
   select min( from products p group by p.legacy_id,