rayrick rayrick - 7 months ago 10
SQL Question

SQL: Can't understand how to select from my tables

I need help with a data extraction. I'm an sql noob and I think I have a serious issue with my data design skills. DB system is MYSQL running on Linux.

Table A is structured like this one:

TYPE SUBTYPE ID
-------------------
xyz aaa 0001
xyz aab 0001
xyz aac 0001
xyz aad 0001
xyz aaa 0002
xyz aaj 0002
xyz aac 0002
xyz aav 0002


Table B is:

TYPE1 SUBTYPE1 TYPE2 SUBTYPE2
-------------------------------------
xyz aaa xyz aab
xyz aac xyz aad


Looking at whole table A, I need to extract all rows where both type and subtype are present as columns in a single table B row. Of course this condition is never met since A.subtype can't be at same time equal to B.subtype1 AND B.subtype2 ...

In the example the result set for id should be:

xyz aaa 0001
xyz aab 0001
xyz aac 0001
xyz aad 0001


I m trying to use a join with 2 AND conditions, but of course I got an empty set.

EDIT:

@Barmar thank you for your support. It seems that I m really near the final solution. Just to keep things clear, I opened this thread with a shortened and simplified data structure, just to highlight the point where I was stuck.
I thought about your solution, and is acceptable to have both result on a single row. Now, I need to reduce execution time.

First join takes about 2 minutes to complete, and it produce around 23Million of rows. The second join (table B) is probably taking longer.
In fact, I need 3 hours to have the final set of 10 millions of rows. How can we impove things a bit? I noticed that mysql engine is not threaded, and the query is only using a single CPU. I indexed all fields used by join, but I m not sure its the right thing to do...since I m not a DBA

Probably split things into different query will help parallelism. thanks for a feedback

Answer

You can join Table A with itself to find all combinations of types and subtypes with the same ID, then compare them with the values in Table B.

SELECT t1.type AS type1, t1.subtype AS subtype1, t2.type AS type2, t2.subtype AS subtype2, t1.id
FROM TableA AS t1
JOIN TableA AS t2 ON t1.id = t2.id AND NOT (t1.type = t2.type AND t1.subtype = t2.subtype)
JOIN TableB AS b ON t1.type = b.type1 AND t1.subtype = b.subtype1 AND t2.type = b.type2 AND t2.subtype = b.subtype2

This returns the two rows from Table A as a single row in the result, rather than as separate rows, I hope that's OK. If you need to split them up, you can move this into a subquery and join it back with the original table A to return each row.

SELECT a.*
FROM TableA AS a
JOIN (the above query) AS x
ON a.id = x.id AND
    ((a.type = x.type1 AND a.subtype = x.subtype1)
    OR
     (a.type = x.type2 AND a.subtype = x.subtype2))

DEMO