Alexander Holmback Alexander Holmback - 7 months ago 15
SQL Question

Select distinct values from one column based on containment in other column

Suppose I have a RDBMS cross-reference table with the following data:

| id | A_id | B_id |
--------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 7 |
| 5 | 2 | 3 |
| 6 | 3 | 2 |
| 7 | 3 | 3 |


What would be the conventional way of selecting all A_ids whose B_ids is a subset of a given set?

For example, for some set (1,2,3,4,5), I would expect the result:

| A_id |
--------
| 1 |
| 3 |


Since A_id 1 and 3 are the only set of B_ids that is a subset of (1,2,3,4,5).

Answer

Hmmm . . . One way uses aggregation:

select a_id
from t
group by a_id
having sum(case when b_id not in (1, 2, 3, 4, 5) then 1 else 0 end) = 0;

However, assuming you have an a table, then I prefer this method:

select a_id
from a
where not exists (select 1
                  from t
                  where t.a_id = a.a_id and t.b_id not in (1, 2, 3, 4, 5)
                 );

This saves the expense of aggregation and the lookup can take advantage of an appropriate index (on t(a_id, b_id)) so this should have better performance.