bogdanbrudiu bogdanbrudiu - 23 days ago 7
SQL Question

SQL cartesian join problem

I have three tables


  • A: A.pID primary key, A.Name nvarchar(250)

  • B: B.pID primary key, B.Name nvarchar(250)

  • C: C.pID primary key, C.Name nvarchar(250)



There is a m to n relation between A and B (table
lA_B
with primary key
lA_B.pID
and
.pInstanceA
Foreign key to table A and
.pInstanceB
Foreign key to table B)

There is a m to n relation between A and C (
table lA_C
with primary key
lA_C.pID
and
.pInstanceA
Foreign key to table A and
.pInstanceB
Foreign key to table C)


  • A1 is in relation with B1, B2 and C1

  • A2 is in relation with B3 and C2, C3

  • A3 is in relation with B4

  • A4 is in relation with C4

  • A5 has no relation



Here is my SQL:

CREATE TABLE [dbo].[A]( [pID] [bigint] NOT NULL, [Name] [nvarchar](250) NULL )
CREATE TABLE [dbo].[B]( [pID] [bigint] NOT NULL, [Name] [nvarchar](250) NULL )
CREATE TABLE [dbo].[C]( [pID] [bigint] NOT NULL, [Name] [nvarchar](250) NULL)
CREATE TABLE [dbo].[lA_B]( [pID] [bigint] NOT NULL, [pInstanceA] [bigint] NULL, [pInstanceB] [bigint] NULL )
CREATE TABLE [dbo].[lA_C]( [pID] [bigint] NOT NULL, [pInstanceA] [bigint] NULL, [pInstanceB] [bigint] NULL )

INSERT INTO [dbo].[A] ([pID] ,[Name]) VALUES (1,'A1')
INSERT INTO [dbo].[A] ([pID] ,[Name]) VALUES (2,'A2')
INSERT INTO [dbo].[A] ([pID] ,[Name]) VALUES (3,'A3')
INSERT INTO [dbo].[A] ([pID] ,[Name]) VALUES (4,'A4')
INSERT INTO [dbo].[A] ([pID] ,[Name]) VALUES (5,'A5')

INSERT INTO [dbo].[B] ([pID] ,[Name]) VALUES (1,'B1')
INSERT INTO [dbo].[B] ([pID] ,[Name]) VALUES (2,'B2')
INSERT INTO [dbo].[B] ([pID] ,[Name]) VALUES (3,'B3')
INSERT INTO [dbo].[B] ([pID] ,[Name]) VALUES (4,'B4')

INSERT INTO [dbo].[C] ([pID] ,[Name]) VALUES (1,'C1')
INSERT INTO [dbo].[C] ([pID] ,[Name]) VALUES (2,'C2')
INSERT INTO [dbo].[C] ([pID] ,[Name]) VALUES (3,'C3')
INSERT INTO [dbo].[C] ([pID] ,[Name]) VALUES (4,'C4')

INSERT INTO [dbo].[lA_B] ([pID],[pInstanceA],[pInstanceB]) VALUES (1,1,1)
INSERT INTO [dbo].[lA_B] ([pID],[pInstanceA],[pInstanceB]) VALUES (2,1,2)
INSERT INTO [dbo].[lA_B] ([pID],[pInstanceA],[pInstanceB]) VALUES (3,2,3)
INSERT INTO [dbo].[lA_B] ([pID],[pInstanceA],[pInstanceB]) VALUES (4,3,4)

INSERT INTO [dbo].[lA_C] ([pID],[pInstanceA],[pInstanceB]) VALUES (1,1,1)
INSERT INTO [dbo].[lA_C] ([pID],[pInstanceA],[pInstanceB]) VALUES (2,2,2)
INSERT INTO [dbo].[lA_C] ([pID],[pInstanceA],[pInstanceB]) VALUES (3,2,3)
INSERT INTO [dbo].[lA_C] ([pID],[pInstanceA],[pInstanceB]) VALUES (4,4,4)


this query:

SELECT
A.Name AS A,
B.Name AS B,
C.Name AS C
FROM
A
left JOIN lA_B ON (A.pID = lA_B.pInstanceA)
left JOIN B ON (B.pID = lA_B.pInstanceB)
left JOIN lA_C ON (A.pID = lA_C.pInstanceA)
left JOIN C ON (C.pID = lA_C.pInstanceB)


returns


A1 B1 C1
A1 B2 C1
A2 B3 C2
A2 B3 C3
A3 B4 NULL
A4 NULL C4
A5 NULL NULL


And now the question :-)
how to query to receive


A1 B1 NULL
A1 B2 NULL
A1 NULL C1
A2 B3 NULL
A2 NULL C2
A2 NULL C3
A3 B4 NULL
A4 NULL C4
A5 NULL NULL


The problem is that when I make the join both with B and with C the result has all the combinations of B C. How can I eliminate this?

Answer Source

You might be able to do that with a UNION:

SELECT A.Name AS A, B.Name AS B, NULL AS C 
FROM A
left JOIN lA_B ON (A.pID=lA_B.pInstanceA) 
left JOIN B ON (lA_B.pInstanceB=B.pID) 
UNION
SELECT A.Name AS A, NULL AS B, C.Name AS C 
FROM A
left JOIN lA_C ON (A.pID=lA_C.pInstanceA) 
left JOIN C ON (lA_C.pInstanceB=C.pID)

The first part selects all combinations of A and B, the second part all combinations of A and C.

If you wish to filter out rows like (A4,NULL,NULL) because there is already a row (A4,NULL,C4), try this query:

SELECT A.Name AS A, B.Name AS B, NULL AS C 
FROM A
LEFT JOIN lA_B ON (A.pID=lA_B.pInstanceA) 
LEFT JOIN B ON (lA_B.pInstanceB=B.pID) 
WHERE b.name is not null 
    or not exists(select * from lA_C where A.pID=lA_C.pInstanceA) 
UNION
SELECT A.Name AS A, NULL AS B, C.Name AS C 
FROM A
LEFT JOIN lA_C ON (A.pID=lA_C.pInstanceA) 
LEFT JOIN C ON (lA_C.pInstanceB=C.pID)
WHERE c.name is not null 
ORDER BY A,B,C

For the join on B, this says to include rows that have a match in B, or for which there is no match in C. The join on C includes only rows which match in C. Rows that do not match either would get included from the join on B.

Note that UNION filters out duplicate rows, like DISTINCT. To include every row, you can use UNION ALL.