GeoGeoGeometry GeoGeoGeometry - 4 months ago 10
SQL Question

Finding likeness between one row and many rows

Disclaimer: I'm very new to MySQL, doing this as an experiment to learn more.

Essentially, I have a table where one row is the 'master' row. I want to compare this row against a handful of other rows. From that handful, I want to select the one that is the most 'correct' or similar to the master row and store it to a variable. If none of the handful rows are similar enough to the master, I'd just want to cancel the operation.

I can think of a way to do this with iteration, but I keep reading that iteration is very inefficient/bad practice with SQL and want to see if there's another way.

With this example, I'd want to select submission 2, and reject submissions 1 and 3.

Type, Answer 1, Answer 2, Answer 3
Master, 100, 200, 300
Submission 1, 100, 400, 300
Submission 2, 100, 200, 300
Submission 3, 200, 100, 300


However, in the following example, I'd want to select submission 1, despite the two not being exactly alike.

Type, Answer 1, Answer 2, Answer 3
Master, 100, 200, 300
Submission 1, 100, 400, 300
Submission 2 , 100, 500, 500
Submission 3, 200, 100, 300


Finally, here I'd want to select Submission 1, as it occurs before Submission 3, despite the two being the same degree of correct.

Type, Answer 1, Answer 2, Answer 3
Master, 100, 200, 300
Submission 1, 100, 400, 300
Submission 2 , 100, 500, 500
Submission 3, 500, 200, 300

Answer

Try the following query. It computes the number of answers which the Master row has in common with each other row. And in the case of tie by number of right answers, it uses the numerical difference from each wrong answer to choose the closest match.

SELECT t.Type, t.Answer1, t.Answer2, t.Answer3,
       (t.a1 + t.a2 + t.a3) AS numRight,
       (t.d1 + t.d2 + t.d3) AS numOff
FROM
(
    SELECT t2.Type, t2.Answer1, t2.Answer2, t2.Answer3,
           CASE WHEN t1.Answer1 = t2.Answer1 THEN 1 ELSE 0 END AS a1,
           CASE WHEN t1.Answer2 = t2.Answer2 THEN 1 ELSE 0 END AS a2,
           CASE WHEN t1.Answer3 = t2.Answer3 THEN 1 ELSE 0 END AS a3,
           ABS(t1.Answer1 - t2.Answer1) ASd1,
           ABS(t1.Answer2 - t2.Answer2) AS d2,
           ABS(t1.Answer3 - t2.Answer3) AS d3
    FROM yourTable t1 INNER JOIN yourTable t2
        ON t1.Type = 'Master' AND t2.Type <> 'Master'
) t
ORDER BY numRight DESC, numOff

If you just want one row, you can add LIMIT 1 to the end of the query.

Demo here:

SQLFiddle