webdad3 webdad3 - 4 months ago 10
SQL Question

My INNER JOIN is filtering out too much

My INNER JOIN is filtering out too much or I can't see what the difference is.

I am trying to consolidate 2 tables from 2 different databases on the same server.

My first step is to find all the values that match:

SELECT intl.*
FROM tblData intl
INNER JOIN [db].dbo.tblData us
ON intl.DataID = us.DataID
AND intl.AnotherID = us.AnotherID
AND intl.DataValue = us.DataValue
AND intl.TextValue = us.TextValue
AND intl.DefaultValue = us.DefaultValue
AND intl.RateValue = us.RateValue
AND intl.YetAnotherID = us.YetAnotherID


I've checked all the columns and they are the same type and same varchar values. So this query should bring back all the matching values.

However... That is not the case. It is only returning 200+ records, where if I run this query after I insert the 200+ records into a temp table:

SELECT intl.DataID
FROM tblData intl
LEFT JOIN TempDataTable TDT
ON TDT.DataID = intl.DataID
AND TDT.AnotherID = intl.AnotherID
AND TDT.DataValue = intl.DataValue
AND TDT.TextValue = intl.TextValue
AND TDT.DefaultValue = intl.DefaultValue
AND TDT.RateValue = intl.RateValue
AND TDT.YetAnotherID = intl.YetAnotherID
WHERE TDT.DataID IS NULL


This query returns 1500+ rows.

I've run a similar query with the US data and found the same discrepancy (1500+ rows returned).

I looked at 1 record just to see if I could identify something whacky, and I found that the records from each table (INTL and US) are EXACTLY the same!

DataID: 1
AnotherID: 1
DataValue: NULL
TextValue: NORMAL
DefaultValue: 0
RateValue: NULL
YetAnotherID: 1


I imagine my inner join is finding a difference between the 2 tables structurally but I can not see it.

Any ideas on what would cause this?

Answer

This is likely caused by the NULL values that you have in the table failing the equality checks of your LEFT JOIN.

NULL values aren't equal to anything, not even to another NULL of the same datatype.

To remedy this, you can add a check to the ON clause for NULL values on both sides in addition to the existing JOIN conditions:

SELECT intl.DataID
FROM   tblData intl
LEFT JOIN TempDataTable TDT
    ON  (
            (TDT.DataID IS NULL AND intl.DataID IS NULL)            
            OR TDT.DataId = intl.DataId
        )
    AND (
            (TDT.AnotherID IS NULL AND intl.AnotherID IS NULL)      
            OR TDT.AnotherId = intl.AnotherId
        )
    AND (
            (TDT.DataValue IS NULL AND intl.DataValue IS NULL)      
            OR TDT.DataValue = intl.DataValue
        )
    AND (
            (TDT.TextValue IS NULL AND intl.TextValue IS NULL)      
            OR TDT.TextValue = intl.TextValue
        )
    AND (
            (TDT.DefaultValue IS NULL AND intl.DefaultValue IS NULL) 
            OR TDT.DefaultValue = intl.DefaultValue
        )
    AND (
            (TDT.RateValue IS NULL AND intl.RateValue IS NULL) 
            OR TDT.RateValue = intl.RateValue
        )
    AND (
            (TDT.YetAnotherID IS NULL AND intl.YetAnotherID IS NULL) 
            OR TDT.YetAnotherID = intl.YetAnotherID
        )
WHERE  TDT.DataID IS NULL