blindguy blindguy - 6 months ago 12
SQL Question

Compare 2 or more column values in arbitrary order

I need a quick way to compare 2 or more values from different tables where the orders are arbitrarily stored in sql server. The data comes from a 3rd party who will not change.

Example data below shows the same item described in two ways. the remaining columns contain other data that i am joining.

table1
i j other columns...
1 2 ...


table2
i j other columns
2 1 ...
1 2 ...


right now for 2, i do a union query to cover both directions (i=i, j=j / i=j, j=i) . but if you expand to 3, that is 9 possible orders.

SELECT * FROM Table1 INNER JOIN Table2 ON Table1.i = Table2.i AND Table1.j = Table2.j

UNION

SELECT * FROM Table1 INNER JOIN Table2 ON Table1.i = Table2.j AND Table1.j = Table2.i


is there a way to order data returned from the first two columns before doing the comparison so i don't have to create all the unions?

Answer

Edit: New xml approach

I wonder how this approach performs:

select  *, cast(    '<c>' + cast(i as varchar) + '</c>' +
                    '<c>' + cast(j as varchar) + '</c>' +
                    '<c>' + cast(k as varchar) + '</c>'
            as xml).query('for $a in /c order by $a descending return $a').value('.', 'varchar(100)')
from    @Table1 o

This can be wrapped in a function and referenced in a persisted column... which should scale very well for you:

create table dbo.Table1 (pk int identity(1,1) primary key, i int, j int, k int);
insert into dbo.Table1
    values(1, 2, 3), (3, 1, 2), (4, 5, 6), (9,9,9);
go

create function dbo.fn_GenerateCompare(@i int, @j int, @k int)
returns varchar(100)
with schemabinding
as
begin
return 
(
    select cast('<c>' + cast(@i as varchar) + '</c>' + 
                '<c>' + cast(@j as varchar) + '</c>' +
                '<c>' + cast(@k as varchar) + '</c>'
    as xml).query('for $a in /c order by $a descending return $a').value('.', 'varchar(100)')
);
end

alter table dbo.Table1
    add Compare as dbo.fn_GenerateCompare(i, j, k) persisted;


select * from dbo.Table1

Returns:

pk  i   j   k   Compare
--  -   -   -   -------
1   1   2   3   321
2   3   1   2   321
3   4   5   6   654
4   9   9   9   999

Your query should now be really simple. Slap an index on the new Compare column and it should fly.




Original Post:

I like the sorted list idea proposed by Thorsten. Heres a rough idea of how it might be done. Performance would be greatly improved by persisting this compare column on the table (trigger or persisted computed column?)

declare @Table1 table (pk int identity(1,1) primary key, i int, j int, k int)
declare @Table2 table (pk int identity(1,1) primary key, i int, j int, k int)

insert into @Table1
values(1, 2, 3), (3, 1, 2), (4, 5, 6), (9,9,9)


insert into @Table2
values (2, 1, 3), (6, 4, 5)


--since the order is unimportant, concatenate the columns into a sorted array
--note how 1,2,3 and 3,1,2 both result in the same compare value:
select  *
from    @Table1 o
cross
apply   (   select  cast(value as varchar) + '/'
            from    @Table1
            unpivot (value for c in (i,j,k)) as u
            where   pk = o.pk
            order 
            by      value
            for xml path('')
        )d(compare)

--now, bring in the 2nd table
select  [src] = 1, pk, compare
from    @Table1 o
cross
apply   (   select  cast(value as varchar) + '/'
            from    @Table1
            unpivot (value for c in (i,j,k)) as u
            where   pk = o.pk
            order 
            by      value
            for xml path('')
        )d(compare)
union all
select  [src] = 2, pk, compare
from    @Table2 o
cross
apply   (   select  cast(value as varchar) + '/'
            from    @Table2
            unpivot (value for c in (i,j,k)) as u
            where   pk = o.pk
            order 
            by      value
            for xml path('')
        )d(compare)


--now just group them to find the matching rows
select min(src), min(pk), compare
from    (   
            select  [src] = 1, pk, compare
            from    @Table1 o
            cross
            apply   (   select  cast(value as varchar) + '/'
                        from    @Table1
                        unpivot (value for c in (i,j,k)) as u
                        where   pk = o.pk
                        order 
                        by      value
                        for xml path('')
                    )d(compare)
            union all
            select  [src] = 2, pk, compare
            from    @Table2 o
            cross
            apply   (   select  cast(value as varchar) + '/'
                        from    @Table2
                        unpivot (value for c in (i,j,k)) as u
                        where   pk = o.pk
                        order 
                        by      value
                        for xml path('')
                    )d(compare)
        )grouped
group
by      compare
having  count(*) > 1;
Comments