James Poulson James Poulson - 6 months ago 15
SQL Question

Delete all but one duplicate record

I have a table that is supposed to keep a trace of visitors to a given profile (user id to user id pair). It turns out my SQL query was a bit off and is producing multiple pairs instead of single ones as intended. With hindsight I should have enforced a unique constraint on each id+id pair.

Now, how could I go about cleaning up the table? What I want to do is delete all duplicate pairs and leave just one.

So for example change this:

23515 -> 52525 date_visited
23515 -> 52525 date_visited
23515 -> 52525 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
23515 -> 52525 date_visited
...


Into this:

23515 -> 52525 date_visited
12345 -> 54321 date_visited


Update: Here is the table structure as requested:

id int(10) UNSIGNED Non Aucun AUTO_INCREMENT
profile_id int(10) UNSIGNED Non 0
visitor_id int(10) UNSIGNED Non 0
date_visited timestamp Non CURRENT_TIMESTAMP

Answer

Use group by in a subquery:

delete from my_tab where id not in 
(select min(id) from my_tab group by profile_id, visitor_id);

You need some kind of unique identifier(here, I'm using id).

UPDATE

As pointed out by @JamesPoulson, this causes a syntax error in MySQL; the correct solution is (as shown in James' answer):

delete from `my_tab` where id not in
( SELECT * FROM 
    (select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
);