Oleg Oleg - 29 days ago 10
SQL Question

why I get different number of records when using LEFT JOIN

I need to join many to many relationship. In order to do that I am using

LEFT OUTER JOIN
(SELECT
SICCode, QuoteID
FROM
NetRate_Quote_Insur_Quote_Busin
WHERE
QuoteID IN
(SELECT DISTINCT QuoteID
FROM NetRate_Quote_Insur_Quote_Busin)) nr ON nr.QuoteID = tblQuotes.QuoteID


The goal is to bring the column
SICCode
. For that, I am using
DISTINCT
statement to make relationship many to one. But for some reason I got different number of records.

The last
LEFT OUTER JOIN
is the one that I am stuck with

Full query looks like this:

SELECT
MONTH(INV.EffectiveDate) AS Effective_Month,
tblQuotes.PolicyNumber,
YEAR(INV.EffectiveDate) AS Effective_Year,
INV.EffectiveDate,
INV.[InvoiceDate] as [Billed Date],
tblQuotes.ExpirationDate as [Policy Expiration Date],
INV.DueDate,
dbo.tblFin_InvoiceDetails.AmtBilled AS Written
FROM
tblClientOffices
INNER JOIN
tblInsureds (NOLOCK)
INNER JOIN
tblFin_Invoices INV
INNER JOIN
tblFin_InvoiceDetails ON INV.InvoiceNum = dbo.tblFin_InvoiceDetails.InvoiceNum
INNER JOIN
tblCompanyLines (NOLOCK) ON dbo.tblFin_InvoiceDetails.CompanyLineGuid = dbo.tblCompanyLines.CompanyLineGUID
INNER JOIN
lstLines (NOLOCK) ON dbo.tblCompanyLines.LineGUID = dbo.lstLines.LineGUID
AND dbo.tblCompanyLines.LineGUID = dbo.lstLines.LineGUID
INNER JOIN
tblProducerLocations (NOLOCK) ON INV.ProducerLocationGUID = tblProducerLocations.ProducerLocationGUID
ON tblInsureds.InsuredGUID = INV.InsuredGUID
INNER JOIN
tblQuotes (NOLOCK) ON INV.QuoteID = tblQuotes.QuoteID
AND INV.QuoteControlNum = tblQuotes.ControlNo
INNER JOIN
lstPolicyTypes (NOLOCK) ON tblQuotes.PolicyTypeID = lstPolicyTypes.PolicyTypeID
INNER JOIN
tblSubmissionGroup (NOLOCK) ON tblQuotes.SubmissionGroupGuid = tblSubmissionGroup.SubmissionGroupGUID
INNER JOIN
tblCompanyLocations (NOLOCK) ON tblQuotes.CompanyLocationGuid = tblCompanyLocations.CompanyLocationGUID
INNER JOIN
tblUsers (NOLOCK) ON INV.UnderwriterUserGUID = tblUsers.UserGUID
ON tblClientOffices.OfficeGUID = tblQuotes.IssuingLocationGuid
LEFT OUTER JOIN
tblUsers tblUsers_1 (NOLOCK) ON tblSubmissionGroup.InHouseProducerUserGuid = tblUsers_1.UserGUID
LEFT OUTER JOIN
(SELECT SICCode, QuoteID
FROM NetRate_Quote_Insur_Quote_Busin
WHERE QuoteID IN
(SELECT DISTINCT QuoteID
FROM NetRate_Quote_Insur_Quote_Busin)) nr ON nr.QuoteID = tblQuotes.QuoteID
WHERE
(tblInsureds.TestInsured = 0)
AND (INV.Failed = 0)
AND (tblFin_InvoiceDetails.ChargeType = 'p')
ORDER BY
YEAR(INV.EffectiveDate),
MONTH(INV.EffectiveDate),
lstLines.LineID


What would be the trick to bring Column
SICCode
and keep the same number of records?

I also tried to do this:

LEFT OUTER JOIN NetRate_Quote_Insur_Quote_Busin nr ON nr.QuoteID =
(
SELECT distinct QuoteID from NetRate_Quote_Insur_Quote_Busin
where QuoteID =tblQuotes.QuoteID
)


But it gives me more records than it should be

Also tried this one:

LEFT OUTER JOIN
( SELECT TOP 1 SICCode, QuoteID
FROM NetRate_Quote_Insur_Quote_Busin
WHERE QuoteID IN
(SELECT DISTINCT QuoteID
FROM NetRate_Quote_Insur_Quote_Busin)order by QuoteID desc ) nr ON nr.QuoteID = tblQuotes.QuoteID


But doesnt bring me any SICCode

Answer

DISTINCT will not change a many to many relationship to a many to one. It looks like your data has many SICCode values to one QuoteID which is why you getting higher record counts.

You could use a window function like Rank, but this is an arbitrary selection.

You could try to pivot/unpivot the SICCode column, but that would give you 1 column for each possible value, which would not work if you have a large number of unique values in the column.

A final option is to create a new table which has a QuoteID column and a SICCodes column, then populate it withSICCodes being a comma seperated list of the codes.

In conclusion, there is no easy way to change a many to many relationship to a many to one relationship unless you are willing to lose data or there are only a few values.

Comments