Nic Nic - 5 months ago 9
MySQL Question

Mysql DELETE rows where dates are not the newest

I have a pretty interesting request. I have Mysql table that has a bunch of data (around 8000 rows). Unfortunately this table contains many duplicate rows. These rows contain data such as an

id
,
first_name
,
last_name
, and a
last_modified
date.

Sets of "duplicate" rows all contain the same
first_name
and
last_name
, however; the
id
's and the
last_modified
date's are all different.

Luckily, the last modified date is how I will be able to tell which row in the duplicates is actually the correct one.

In other words, I would like to delete all rows given a specific
first_name
and
last_name
where the
last_modified
date is not the newest.

For example, if I have some rows that look like this:

1 Bob Smith 2015-04-19
2 Bob Smith 2016-04-19
3 Bob Smith 2014-04-19
4 Richard McDonald 2008-04-19
5 Joe Blo 2011-04-19
6 Jo Blo 2010-04-19


Note: The years are different so for each person only the row with the most recent year should be kept.

I would like a query that is going to only keep the row given a first name and last name that is the newest (latest) date. So if I ran my desired query the rows would look like this after:

2 Bob Smith 2016-04-19
4 Richard McDonald 2008-04-19
5 Joe Blo 2011-04-19


Can anybody write me a generic query that can accomplish this? Thank so much.

Answer

You can do this with delete and join:

delete t
    from t join
         (select first_name, last_name, max(last_modified) as maxlm
          from t
          group by first_name, last_name
         ) tt
         on t.first_name = tt.first_name and t.last_name = tt.last_name and
            t.last_modified < tt.maxlm;