Héléna Héléna - 2 months ago 5
SQL Question

sql: How to remove duplicate rows (content the same, but sequence is different)

You see that the SKU1 has 2 rows, but actually the content of these 2 rows are the same, just the sequence of "b" and "c" makes difference.

What if I want to remove the duplicate rows as shown in the 2nd picture?

In Oracle there is a

function that can realize it, but I used SQL Server, therefore it doesn't work following the instruction of the below post:

How to remove duplicate rows in SQL

enter image description here


If it's only 2 columns where the order should not matter for the group by?

Then you could use IIF (or a CASE WHEN) to calculate the maximum and minimum values.
And use those calculated values in the GROUP BY.

For example:

select Name, 
MAX(Val1) as Val1,
MIN(Val2) as Val2
from Table1
IIF(Val2 is null or Val1 < Val2, Val1, Val2),
IIF(Val1 is null or Val1 < Val2, Val2, Val1);

For the example records that would give the result:

Name Val1 Val2
SKU1 20   10
SKU2 20   10

Or if you want to use a fancy XML trick :

select Name, max(Val1) as Val1, min(Val2) as Val2
from (
  select *, 
    ).query('for $n in /n order by $n return string($n)'
  ) as varchar(6)) as SortedValues
  from Table1
) q
group by Name, SortedValues;

The last method could be more usefull when there are more columns involved.

To actually remove the duplicates?
Here's an example that uses a table variable to demonstrate:

declare @Table1 TABLE (Id int, Name varchar(20), Val1 int, Val2 int);

Insert Into @Table1 values

delete from @Table1
where Id in (
    select Id
    from (
    select Id, 
    row_number() over (partition by Name, 
         IIF(Val2 is null or Val1 < Val2, Val1, Val2),
         IIF(Val1 is null or Val1 < Val2, Val2, Val1)
       order by Val1 desc, Val2 desc
    ) as rn
    from @Table1
    ) q
    where rn > 1

select * from @Table1;