user2964644 user2964644 - 7 months ago 10
SQL Question

Compare five columns to five other columns in SQL

I want to compare five columns (x1-x5) to five other columns (y1-y5), to determine the extent to which they agree (in any order).

A score ranging from 0 to 5 is awarded, for the number of values (in x1-x5) that matched (in any of the five columns y1-y5).

I could write this out as a lengthy series of CASE WHEN statements, but is there a more efficient way to achieve this?

The values are all strings. Full text search is not available on the server.

The code below illustrates an example where n_found is what I am trying to create.

DROP TABLE mytable;
CREATE TABLE mytable(
id INTEGER NOT NULL PRIMARY KEY
,x1 VARCHAR(1) NOT NULL
,x2 VARCHAR(1) NOT NULL
,x3 VARCHAR(1) NOT NULL
,x4 VARCHAR(1) NOT NULL
,x5 VARCHAR(1) NOT NULL
,y1 VARCHAR(1) NOT NULL
,y2 VARCHAR(1) NOT NULL
,y3 VARCHAR(1) NOT NULL
,y4 VARCHAR(1) NOT NULL
,y5 VARCHAR(1) NOT NULL
,n_found INTEGER NOT NULL
);
INSERT INTO mytable(id,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,n_found)
VALUES (1,'a','b','c','d','e','r','a','t','y','z',1);
INSERT INTO mytable(id,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,n_found)
VALUES (2,'e','a','b','d','c','m','o','a','b','z',2);
INSERT INTO mytable(id,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,n_found)
VALUES (3,'a','b','c','d','e','f','g','h','i','j',0);
INSERT INTO mytable(id,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,n_found)
VALUES (4,'a','b','c','d','e','b','e','a','c','d',5);
INSERT INTO mytable(id,x1,x2,x3,x4,x5,y1,y2,y3,y4,y5,n_found)
VALUES (5,'a','b','c','e','d','d','b','x','y','z',2);

Answer

You could use CROSS APPLY and Table Value Constructor:

SELECT *
FROM mytable
CROSS APPLY (SELECT COUNT(*)
             FROM (VALUES (x1),(x2),(x3),(x4),(x5)) AS t1(v)
             WHERE v IN (SELECT v
                         FROM (VALUES (y1),(y2),(y3),(y4),(y5)) AS t2(v))
            ) AS s(found);

LiveDemo

Output:

╔═════════════════════════════════════════════════╗
║ id x1 x2 x3 x4 x5 y1 y2 y3 y4 y5 n_found found  ║
╠═════════════════════════════════════════════════╣
║ 1  a  b  c  d  e  r  a  t  y  z  1       1      ║
║ 2  e  a  b  d  c  m  o  a  b  z  2       2      ║
║ 3  a  b  c  d  e  f  g  h  i  j  0       0      ║
║ 4  a  b  c  d  e  b  e  a  c  d  5       5      ║
║ 5  a  b  c  e  d  d  b  x  y  z  2       2      ║
╚═════════════════════════════════════════════════╝

Note:

If you want to handle NULLs you could use COALESCE to exchange NULL with value that you are sure does not exists:

COALESCE(x1,'^')

LiveDemo2

EDIT:

This seems to evaluate NULL as agreeing. Can you modify it so that NULL compared to NULL does not match?

As I said before it is already working that way.

LiveDemo3

╔════╦════╦════╦════╦════╦══════╦════╦════╦════╦════╦══════╦═════════╦═══════╗
║ id ║ x1 ║ x2 ║ x3 ║ x4 ║  x5  ║ y1 ║ y2 ║ y3 ║ y4 ║  y5  ║ n_found ║ found ║
╠════╬════╬════╬════╬════╬══════╬════╬════╬════╬════╬══════╬═════════╬═══════╣
║  6 ║ a  ║ a  ║ c  ║ e  ║ NULL ║ o  ║ y  ║ x  ║ y  ║ NULL ║       0 ║     0 ║
╚════╩════╩════╩════╩════╩══════╩════╩════╩════╩════╩══════╩═════════╩═══════╝

EDIT 2:

To handle NULL you could also use INTERSECT (ALL) equivalent:

SELECT *
FROM mytable
CROSS APPLY (SELECT COUNT(*)
             FROM (SELECT v,
                       ROW_NUMBER() OVER(PARTITION BY v ORDER BY (SELECT 1)) AS c
                   FROM (VALUES (x1),(x2),(x3),(x4),(x5)) AS t1(v)
                   INTERSECT
                   SELECT v,
                       ROW_NUMBER() OVER(PARTITION BY v ORDER BY (SELECT 1))
                   FROM (VALUES (y1),(y2),(y3),(y4),(y5)) AS t2(v)
                  ) AS s2
            ) AS s(found)

LiveDemo4