user3051065 user3051065 - 5 months ago 22
SQL Question

SQL Server: Comparing Concatenated Columns for Equality of Unordered Set

I have two tables with an ID column (that matches the tables together) and concatenated columns where the concatenated values are in an arbitrary order. I want to compare to see if the two columns contain the exact same items (in any order) and output the ID if they don't.

Example:

Table 1

PersonID Products
1 Apple|Pear|Orange
2 Flour|Apple|Butter
3 Apple
4 Banana|Cashews
5 Juice|Crackers|Banana|Cashews
6 Cashews


Table 2

PersonID Products
1 Orange|Apple|Pear
2 Flour|Apple|Butter
3 Apple|Banana
4 Banana
5 Crackers|Juice|Banana|Cashews
6 Pear|Crackers


I would like to get all the personids where products are not an identical set (any order) between table 1 and table 2. So in this case that is:
Person 3 (extra product), person 4 (missing product) and person 6 (different products).

My current query is erroneously picking up Persons 1 and 5 because they have products ordered differently.

My current query is like this:

select t1.personid, t1.products as t1products, t2.products as t2products
from table1 t1 (nolock)
inner join table2 t2 (nolock) on t1.personid = t2.personid
where t1.products != t2.products


I also have the data in pre-concatenation form, with multiple rows per personid (one for each product, and again separately in two tables), if that is more helpful - I haven't figured out how to concatenate them in alphabetical order, so a solution to that effect would also solve this problem.

Edit (clarification):
The unconcatenated data looks like the following:

Table 1

PersonID Product
1 Apple
1 Pear
1 Orange
2 Flour
2 Apple
2 Butter
3 Apple


etc.

Table 2

PersonID Product
1 Orange
1 Apple
1 Pear
2 Flour
2 Apple
2 Butter
3 Apple
3 Banana


etc.

I used STUFF to concatenate them by PersonID.

Answer

If you have it in the form of one product for each row then you can query for all the results that don't have a match for both product and personid in the opposite table. Then do the same for the other table and union the results:

SELECT t1.personid, t1.product, '2' AS [Not Found In Table]
FROM table1 t1 (nolock)
LEFT JOIN table2 t2 ON t1.personid = t2.personid AND t1.product = t2.product
WHERE t2.product IS NULL
UNION
SELECT t2.personid, t2.product,  '1' AS [Not Found In Table]
FROM table2 t2 (nolock)
LEFT JOIN table1 t1 ON t2.personid = t1.personid AND t2.product = t1.product
WHERE t1.product IS NULL

You could wrap this in a select and CONCAT the results to feed you a nice list of what is missing from which table for each person who doesn't match.