Superdooperhero - 4 months ago
SQL Question

Oracle SQL: how to group by, but get multiple rows if a group is repeated at a later date

I have the following query:

select paaf.assignment_id,
       paaf.position_id,
       paaf.effective_start_date effective_start_date,
       paaf.effective_end_date effective_end_date
from per_all_assignments_f paaf
where paaf.position_id is not null
  and paaf.assignment_type in ('E', 'C')
  and paaf.primary_flag = 'Y'
  and paaf.assignment_number like '209384%'
order by 3


Which returns:

"assignment_id" "position_id" "effective_start_date" "effective_end_date"
6518 5323 01/01/2013 28/02/2014
6518 8133 01/03/2014 30/06/2014
6518 8133 01/07/2014 31/10/2015
6518 239570 01/11/2015 15/11/2015
6518 239570 16/11/2015 31/12/2015
6518 8133 01/01/2016 27/07/2016
6518 8133 28/07/2016 31/12/4712


I grouped this using:

select paaf.assignment_id,
       paaf.position_id,
       min(paaf.effective_start_date) effective_start_date,
       max(paaf.effective_end_date) effective_end_date
from per_all_assignments_f paaf
where paaf.position_id is not null
  and paaf.assignment_type in ('E', 'C')
  and paaf.primary_flag = 'Y'
  and paaf.assignment_number like '209384%'
group by paaf.assignment_id, paaf.position_id


Which returns:

"assignment_id" "position_id" "effective_start_date" "effective_end_date"
6518 5323 01/01/2013 28/02/2014
6518 8133 01/03/2014 31/12/4712
6518 239570 01/11/2015 31/12/2015


But I need a query that returns:

"assignment_id" "position_id" "effective_start_date" "effective_end_date"
6518 5323 01/01/2013 28/02/2014
6518 8133 01/03/2014 31/10/2015
6518 239570 01/11/2015 31/12/2015
6518 8133 01/01/2016 31/12/4712


That is, position_id 8133 must produce two rows, because it covers two separate periods chronologically (one before and one after the 239570 period). Each period must be grouped on its own rather than merged into a single row.

Is there some way of accomplishing this using the date order?

Answer

This is a gaps-and-islands problem. There are different approaches, but a simple one uses the difference of two row numbers:

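with paaf as (<your first query here>
     )
select paaf.assignment_id,
       paaf.position_id,
       min(paaf.effective_start_date) as effective_start_date,
       max(paaf.effective_end_date) as effective_end_date
from (select paaf.*,
             -- position of the row within its assignment, by date
             row_number() over (partition by assignment_id
                                order by effective_start_date) as seqnum,
             -- position of the row within its assignment and position_id, by date
             row_number() over (partition by assignment_id, position_id
                                order by effective_start_date) as seqnum_p
      from paaf
     ) paaf
-- the difference is constant within each consecutive run of a position_id;
-- partitioning by assignment_id keeps separate assignments from mixing
group by (seqnum - seqnum_p), position_id, assignment_id
order by paaf.assignment_id, min(paaf.effective_start_date);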
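
To see why the difference works, it may help to look at the intermediate values before grouping. The following is only a sketch: the CTE name sample_rows and the column alias grp are hypothetical, and the rows are the seven rows from the question typed in as literals instead of read from per_all_assignments_f. The row numbers are also partitioned by assignment_id here, which makes no difference for this single-assignment sample.

```sql
-- Sample rows copied from the question; "sample_rows" is a hypothetical name.
with sample_rows as (
  select 6518 as assignment_id, 5323 as position_id,
         date '2013-01-01' as effective_start_date,
         date '2014-02-28' as effective_end_date from dual union all
  select 6518, 8133,   date '2014-03-01', date '2014-06-30' from dual union all
  select 6518, 8133,   date '2014-07-01', date '2015-10-31' from dual union all
  select 6518, 239570, date '2015-11-01', date '2015-11-15' from dual union all
  select 6518, 239570, date '2015-11-16', date '2015-12-31' from dual union all
  select 6518, 8133,   date '2016-01-01', date '2016-07-27' from dual union all
  select 6518, 8133,   date '2016-07-28', date '4712-12-31' from dual
)
select position_id,
       effective_start_date,
       seqnum,
       seqnum_p,
       seqnum - seqnum_p as grp   -- hypothetical alias for the island key
from (select s.*,
             row_number() over (partition by assignment_id
                                order by effective_start_date) as seqnum,
             row_number() over (partition by assignment_id, position_id
                                order by effective_start_date) as seqnum_p
      from sample_rows s)
order by effective_start_date;

-- position_id  start       seqnum  seqnum_p  grp
-- 5323         2013-01-01  1       1         0
-- 8133         2014-03-01  2       1         1
-- 8133         2014-07-01  3       2         1
-- 239570       2015-11-01  4       1         3
-- 239570       2015-11-16  5       2         3
-- 8133         2016-01-01  6       3         3
-- 8133         2016-07-28  7       4         3
```

Within each unbroken run of a position_id both counters advance by one per row, so grp stays constant; when another position interrupts the run, seqnum keeps counting while seqnum_p does not, so grp jumps. Note that grp = 3 occurs for both 239570 and the second 8133 run, which is why position_id has to be part of the group by alongside the difference.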