kvietok - 1 year ago 87
SQL Question

# sql query - selection of a correct row from a groups specified by id

``````    id  risk    origin  strength    strength_sol
13456   1   1   3   3
13456   134 0   5   NULL
13456   128 0   7   NULL
13456   121 0   5   NULL
13456   122 0   4   NULL
13456   190 0   2   NULL
22367   1   1   5   5
22367   128 0   4   NULL
22367   1   0   2   NULL
22367   36  0   6   NULL
12789   1   1   5   5
12789   1   0   4   NULL
12789   118 1   2   NULL
12789   118 1   5   NULL
12789   1   0   7   NULL
16908   1   0   5   5
16908   36  0   4   NULL
16908   28  1   3   NULL
16908   128 1   5   NULL
16908   1   0   7   NULL
12439   1   0   4   4
12439   134 0   2   NULL
12439   16  0   5   NULL
15678   36  0   4   NULL
15678   28  0   2   NULL
15678   134 0   5   NULL
``````

Problem and data desription:
I have a big dataset. Above you can see just a small sample in order to describe my problem.
I need to choose exactly one row for each id.
In the dataset above there are all the possible cases that can happen.
The last two columns are not a part of the data set, it is the result that I need to get.Origin is a 0/1 variable.

I need to choose this:
for one id:
1. situation: when risk = 1 and origin = 1 - Im ok, I will take this row, there can be the only row like this for one id in the dataset
2.situation: when for one id there is no case that risk=1 and origin =1, I have to choose the row where risk=1 and origin = 0,
if there are more such a rows, it doesnt matter which one I choose(But I have to choose only ONE of them, not all of them).
3. when there in no risk = 1 in any row for one id (doesnt matter what is the value of the origin), I just simply put NULL as strength_sol

my solution is like this (but it is not correct)
case when risk=1 and origin =1 then strength
when risk=1 and origin = 0 then strength
else NULL end as strength

This solution is not correct, because in the situation number one can happen that there is also a row
with risk=1 and origin=0, but I m not interested in that row(I want NULL for that row).
I cant figure out how can I solve it correctly. Thanks.

You can use the `row_number` function to number the rows so that the first row would be the one with highest priority (in this case risk=1 and origin=1) and the second with the next highest priority (risk=1 and origin=0). All the other rows are numbered arbitrarily and then you can choose the first row from each group.

``````select id,risk,origin,strength,
case when rnum=1 then strength end strength_sol
from (select t.*,
row_number() over(partition by id
order by case when risk=1 and origin =1 then 1
when risk=1 and origin =0 then 2
else 3 end) rnum
from t
) x
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download