Chun ping Wang Chun ping Wang - 6 months ago 20
MySQL Question

Query results with consecutive duplicates remove

Hi i want to know how to remove consecutive duplicates from a query results based on single column. In this case null means they didn't want to buy anything so they click back button. I want history of changes of items from buyer, with earliest buying info.

select item, buyer, buy_date
from item order by buy_date

item buyer buy_date
null Sam 04/24/2016
Lipstick Anna 05/31/2016
Charger Tim 06/01/2016
Charger James 06/03/2016
null Tim 06/03/2016
null James 06/04/2016
Nail Polish Sarah 06/04/2016


Here is sample of test results.

Now my new results should be

item buyer buy_date
null Sam 04/24/2016
Lipstick Anna 05/31/2016
Charger Tim 06/01/2016
null Tim 06/03/2016
Nail Polish Sarah 06/04/2016


Keep only the first of the consecutive duplicates. I am not deleting any record. I am just filtering out the results so that consecutive duplicates are remove.


  1. How would I do this in generic ANSI-SQL?

  2. If it is not possible, is there way to do it in the three major SQL vendor?


Answer

You can do this with ANSI standard window functions. One method uses lag() to get the previous item. It then sums up the number of times when the item changes and uses this information to find the first row in each group:

select r.*
from (select r.*, row_number() over (partition by grp order by date) as seqnum
      from (select r.*,
                   sum(case when prev_item = item then 0 else 1 end) over (order by buy_date) as grp
            from (select r.*, lag(item) over (order by buy_date) as prev_item
                  from results r
                 ) r
           ) r
     ) r
where seqnum = 1;

Hmmm. I overthought that one. You only need lag():

select r.*
from (select r.*, lag(item) over (order by buy_date) as prev_item
      from results r
     ) r
where prev_item is null or prev_item <> item;