Kyle Jorgensen Kyle Jorgensen - 5 months ago 14
SQL Question

MS Access query with multiple joins returning duplicate rows

I am working with Access to query multiple tables that have passenger info in them. I've been able to distill this down to what I think is causing the issue, but I am unsure how to fix it.

Essentially there are three tables:

PASSENGERS

ID | PASSENGER_NAME | COST_CENTER
-------------------------------------
1 | John Wright | 31231
2 | Cheryl Brown | 54555
3 | Adam Yang | 65655


FARE_LEVEL

ID | TICKET_NUMBER | PASSENGER_NAME | TICKET_AMT | IS_REFUND
---------------------------------------------------------------
1 | 14325435 | John Wright | $632.64 | 0
2 | 46746745 | Adam Yang | $797.32 | 0
3 | 45354434 | Cheryl Brown | $2331.00 | 0
4 | 67876456 | Cheryl Brown | $990.11 | 0
5 | 34654546 | Adam Yang | $552.71 | 0
6 | 14325435 | John Wright | -$632.64 | 1
7 | 87989879 | John Wright | $123.11 | 0
8 | 99124324 | Adam Yang | $1114.42 | 0
9 | 77231235 | Adam Yang | $6232.32 | 0


INCENTIVE_LOG

ID | PASSENGER_NAME | INCENTIVE_AMT
-------------------------------------
1 | Adam Yang | $1000.00
2 | Cheryl Brown | $1000.00
3 | John Wright | $1000.00
4 | John Wright | $1000.00
5 | John Wright | $1000.00


I am using this query to join them:

SELECT
PASSENGERS.PASSENGER_NAME,
Sum(FARE_LEVEL.TICKET_AMT) AS SumOfFARES,
Count(INCENTIVE_LOG.INCENTIVE) AS CountOfINCENTIVE,
Sum(INCENTIVE_LOG.INCENTIVE) AS SumOfINCENTIVE
FROM (PASSENGERS LEFT OUTER JOIN FARE_LEVEL ON PASSENGERS.PASSENGER_NAME = FARE_LEVEL.PASSENGER_NAME)
INNER JOIN INCENTIVE_LOG ON PASSENGERS.PASSENGER_NAME = INCENTIVE_LOG.PASSENGER_NAME
GROUP BY PASSENGERS.PASSENGER_NAME;


The result I am looking for is the sum of all fares for each passenger in the PASSENGER table as well as the sum of all incentives taken for each passenger in the PASSENGER table. So for "John Wright" I would hope to see

PASSENGER_NAME | SumOfFares | CountOfIncentive | SumOfIncentive
---------------------------------------------------------------
John Wright | $123.11 | 3 | $3000.00


Instead I am getting get a result that seems to double count. When I change the GROUP BY to

GROUP BY PASSENGERS.PASSENGER_NAME, INCENTIVE_LOG.ID, FARE_LEVEL.ID;


I can see all of the duplicated rows that it is summing in error. It appears to be creating a row for every combination of FARE_LEVEL and INCENTIVE row that corresponds to a passenger so if someone had 7 fares and 3 incentives it would create 21 rows.

I am fairly certain the issue is in my join, but I am not sure how to fix it.

EDIT:
I was able to solve the issue by creating two subqueries within the original query


SELECT
PASSENGERS.PASSENGER_NAME,
Sum(Query4.SumOfTICKET_AMT) AS SumOfFARES,
Sum(Query2.SumOfINCENTIVE) AS Incentive
FROM (PASSENGERS
LEFT OUTER JOIN (
SELECT PASSENGERS.PASSENGER_NAME, Sum(FARE_LEVEL.TICKET_AMT) AS SumOfTICKET_AMT
FROM PASSENGERS LEFT OUTER JOIN FARE_LEVEL ON PASSENGERS.PASSENGER_NAME = FARE_LEVEL.PASSENGER_NAME
GROUP BY PASSENGERS.PASSENGER_NAME
) AS Query4
ON PASSENGERS.PASSENGER_NAME = Query4.PASSENGER_NAME)
INNER JOIN (
SELECT PASSENGERS.PASSENGER_NAME, Sum(INCENTIVE_LOG.INCENTIVE) AS SumOfINCENTIVE
FROM PASSENGERS LEFT OUTER JOIN INCENTIVE_LOG ON PASSENGERS.PASSENGER_NAME = INCENTIVE_LOG.PASSENGER_NAME
GROUP BY PASSENGERS.PASSENGER_NAME
) AS Query2
ON PASSENGERS.PASSENGER_NAME = Query2.PASSENGER_NAME
GROUP BY PASSENGERS.PASSENGER_NAME;

Answer

You have to sum the incentives first:

SELECT PASSENGERS.PASSENGER_NAME,
     Count(INCENTIVE_LOG.INCENTIVE) AS CountOfINCENTIVE,
     Sum(INCENTIVE_LOG.INCENTIVE) AS SumOfINCENTIVE
FROM (select PASSENGERS.PASSENGER_NAME, 
           Sum(FARE_LEVEL.TICKET_AMT) AS SumOfFARES
      From PASSENGERS 
           LEFT OUTER JOIN FARE_LEVEL 
                ON PASSENGERS.PASSENGER_NAME = FARE_LEVEL.PASSENGER_NAME
      ) x
     INNER JOIN INCENTIVE_LOG 
          ON x.PASSENGER_NAME = INCENTIVE_LOG.PASSENGER_NAME
GROUP BY PASSENGERS.PASSENGER_NAME;

Also, you can just create a first query that just calculates the summed incentives then use that query in this summary.