007 007 - 7 days ago 5
SQL Question

T-SQL Join NULL and NOT NULL Records

Code:

CREATE TABLE [dbo].[T1] ( [ID1] INT NOT NULL, [ID2] INT NOT NULL );
CREATE TABLE [dbo].[T2] ( [ID1] INT NOT NULL, [ID2] INT NULL, [VAL] INT NOT NULL );

INSERT INTO [dbo].[T1] ( [ID1], [ID2] )
VALUES ( 1, 1 )
,( 1, 2 )
,( 1, 3 )
,( 1, 4 )
,( 1, 5 )
,( 2, 1 );

INSERT INTO [dbo].[T2] ( [ID1], [ID2], [VAL] )
VALUES ( 1, NULL, 25000 )
,( 1, 2, 30000 )
,( 2, NULL, 30000 );


Goal:

T1 has a full list of ID1 and ID2 mapping.

T2 is a foreign table with some ID1/ID2 mapped to a Val.

The end goal is to "cross join" T1 and T2 (ID1/ID2) when T2.ID2 = NULL but skip the output of that T1.ID2 when T2.ID2 has a NOT NULL value and keep the VAL from T2 for NOT-NULL T2.ID2s. The performance has to be FAST!!!

Desired output:

ID1 ID2 VAL
1 1 25000
1 2 30000 -- T1.ID2 = 2 takes the priority
1 3 25000
1 4 25000
1 5 25000
2 1 30000


My Try:

SELECT [T2].[ID1]
, [T1].[ID2]
, [T2].[VAL]
FROM [dbo].[T1] [T1]
JOIN [dbo].[T2] [T2]
ON [T1].[ID1] = [T2].[ID1]
WHERE [T2].[ID2] IS NULL
OR [T1].[ID2] IN (SELECT [T3].[ID2]
FROM [dbo].[T2] [T3]
WHERE [T2].[ID1] = [T3].[ID1]
AND [T2].[ID2] = [T3].[ID2] )
--ORDER BY [T2].[ID1]
-- , [T1].[ID2]
-- , [T2].[VAL];


Current Output:

ID1 ID2 VAL
1 1 25000
1 2 25000
1 3 25000
1 4 25000
1 5 25000
1 2 30000
2 1 30000

DROP TABLE [dbo].[T1];
DROP TABLE [dbo].[T2];

Answer

You have a default value in your lookup table. You can approach this using left join and coalesce():

select t1.id1, t2.id2,
       coalesce(t2.val, t2default.val) as val
from t1 left join
     t2
     on t1.id1 = t2.id1 and t1.id2 = t2.id2 left join
     t2 t2default
     on t1.id1 = t2default.id1 and t2default.id2 is null;

Assuming that you have the right indexes on the id columns used in the joins, the performance should be very good.

If you care about performance, an alternative method might be worth a try:

select t1.id1, t1.id2, t2.val
from t1 outer apply
     (select top 1 t2.*
      from t2
      where t2.id1 = t1.id and (t2.id2 is null or t2.id2 = t1.id2)
      order by (case when t2.id2 = t1.id2 then 1 else 2 end)
     ) t2;

This seems more complicated to me, but sometimes apply has surprisingly good performance characteristics.