Fehu Fehu - 6 months ago 12
SQL Question

JOIN between redundant tables

I have problems writing this query.
Table A contains a reference to table B1 OR to table B2.
Table B1 and B2 have a 1 to 1 reference to table C1 or C2, which in turn contain a reference to table D.
I'm able to do it separately for B1 or B2, but not for both in the same query, and the server code requires that the values are passed by a single query.

I need to select the rows from table A that must be summed (via the boolean field), separating the values grouped by table D:

Name | SUM(ValueB1) | SUM(ValueB2)
------------------------------------
NameD1 | 1552 | 654
NameD2 | 564 | 564
NameD3 | 645 | 984


What I tried is:

SELECT
tableD.NameD AS Name,
SUM(IF(tableA.Type = 'B1', tableB1,ValueB1, 0)),
SUM(IF(tableA.Type = 'B2', tableB2,ValueB2, 0))
FROM tableA
INNER JOIN tableB1 ON tableA.ID_tableB1 = tableB1.ID_tableB1
INNER JOIN tableC1 ON tableB1.ID_tableC1 = tableC1.ID_tableC1
INNER JOIN tableD ON tableC1.ID_tableD = tableD.ID_tableD
INNER JOIN tableB2 ON tableA.ID_tableB2 = tableB1.ID_tableB2
INNER JOIN tableC2 ON tableB2.ID_tableC2 = tableC2.ID_tableC2
INNER JOIN tableD ON tableC2.ID_tableD = tableD.ID_tableD
WHERE tableA.Boolean = 'sum'
GROUP BY tableD.ID_tableD

TABLE A
---------------
- ID_tableA
- ID_tableB1
- ID_tableB2
- Type (B1 or B2)
- Boolean (sum / not sum)
- OtherFields

TABLE B1
---------------
- ID_tableB1
- ID_tableC1
- ValueB1
- OtherFields

TABLE B2
---------------
- ID_tableB2
- ID_tableC2
- ValueB2
- OtherFields

TABLE C1
---------------
- ID_tableC1
- ID_tableD

TABLE C2
---------------
- ID_tableC2
- ID_tableD

TABLE D
---------------
- ID_tableD
- NameD


The problem for me is to reference two times "Table D". I can't use alias, and messing with the query, I get nothing or only one sum column. :/

I get the error "Error Code: 1066. Not unique table/alias"

Answer

Since you have the TableD twice, so you just have to give it a different name, e.g. d1 and d2 (that is what the error message is telling you):

...
FROM tableA
LEFT OUTER JOIN tableB1 ON tableA.ID_tableB1 = tableB1.ID_tableB1
INNER JOIN tableC1 ON tableB1.ID_tableC1 = tableC1.ID_tableC1
INNER JOIN tableD d1 ON tableC1.ID_tableD = d1.ID_tableD
LEFT OUTER JOIN tableB2 ON tableA.ID_tableB2 = tableB2.ID_tableB2
INNER JOIN tableC2 ON tableB2.ID_tableC2 = tableC2.ID_tableC2
INNER JOIN tableD d2 ON tableC2.ID_tableD = d2.ID_tableD
WHERE tableA.Boolean = 'sum'
...

And you have to use outer join for A-B1 and A-B2, since one of them probably won't exist.

But actually, i would suggest that you use union:

select NameD AS Name, sum(ValueB1), sum(ValueB2)
from (
  SELECT tableA.boolean, tableD.ID_tableD, tableD.NameD, tableB1.ValueB1 as ValueB1, 0 as ValueB2
  FROM tableA
  INNER JOIN tableB1 ON tableA.ID_tableB1 = tableB1.ID_tableB1
  INNER JOIN tableC1 ON tableB1.ID_tableC1 = tableC1.ID_tableC1
  INNER JOIN tableD ON tableC1.ID_tableD = tableD.ID_tableD
  union all
  SELECT tableA.boolean, tableD.ID_tableD, tableD.NameD, 0, tableB2.ValueB2
  FROM tableA
  INNER JOIN tableB2 ON tableA.ID_tableB2 = tableB2.ID_tableB2
  INNER JOIN tableC2 ON tableB2.ID_tableC2 = tableC2.ID_tableC2
  INNER JOIN tableD ON tableC2.ID_tableD = tableD.ID_tableD
) as sq
WHERE sq.Boolean = 'sum'
GROUP BY sq.ID_tableD, sq.NameD

since it makes it clearer what you are joining (to the reader and the optimizer) and, most importantly, prevents you from summing stuff twice if an entry in A would have a reference to both table B1 and to table B2 (if that is possible in your model).

Comments