Chill Chill - 1 month ago 11
SQL Question

SQL theory: Filtering out duplicates in one column, picking lowest value in other column

I am trying to figure out the best way to remove rows from a result set where either the value in one column or the value in a different column has a duplicate in the result set.

Imagine the results of a query are as follows:

a_value | b_value
-----------------
1 | 1
2 | 1
2 | 2
3 | 1
4 | 3
5 | 2
6 | 4
6 | 5


What I want to do is:


  • Eliminate all rows that have duplicate values in a_value

  • Pick only 1 row for a given b_value



So I'd want the filtered results to end up like this after eliminating a_value duplicates:

a_value | b_value
-----------------
1 | 1
3 | 1
4 | 3
5 | 2


And then like this after picking only a single b_value:

a_value | b_value
-----------------
1 | 1
4 | 3
5 | 2


I'd appreciate suggestions on how to accomplish this task in an efficient way via SQL.

Answer
with
     q_res ( a_value, b_value ) as (
       select 1, 1 from dual union all
       select 2, 1 from dual union all
       select 2, 2 from dual union all
       select 3, 1 from dual union all
       select 4, 3 from dual union all
       select 5, 2 from dual union all
       select 6, 4 from dual union all
       select 6, 5 from dual
     )
-- end test data; solution begins below
select   min(a_value) as a_value, b_value
from     ( 
           select   a_value, min(b_value) as b_value
           from     q_res
           group by a_value
           having   count(*) = 1
         )
group by b_value
order by a_value   -- ORDER BY is optional
;

A_VALUE  B_VALUE
-------  -------
      1        1
      4        3
      5        2