user3642531 user3642531 - 2 months ago 19
SQL Question

Collapsing date ranges in SQL (Netezza)

I have a data set that looks something like this:

Visit ID Admission Date Discharge Date Unit
20 01/01/2015 12:45 01/01/2015 13:57 ER
20 01/03/2015 13:57 01/04/2015 11:57 ER
20 01/04/2015 11:57 01/04/2015 19:32 Trauma
20 01/04/2015 19:32 01/04/2015 21:22 ER


My goal is to get admission/discharge dates for each unit. The issue is that sometimes patients change beds within in the same unit, and this action is labeled as a transfer even though the patient is still within the same unit. So I would like to collapse those date ranges so the output instead looks like this:

Visit ID Admission Date Discharge Date Unit
20 01/01/2015 12:45 01/01/2015 11:57 ER
20 01/04/2015 11:57 01/04/2015 19:32 Trauma
20 01/04/2015 19:32 01/04/2015 21:22 ER


I don't know how to achieve this...I was thinking of what partition I should use but every ranking partition I can think of (rank/dense_rank) will assign the first two ER values the same rank as the last ER value, which would be incorrect.

Basically, my question is the same this is unanswered question: Collapsing date records only if the value doesn't change - Oracle SQL

I'm using Netezza.

Answer

You can use a left join to see if something is connected to the previous record. If there is no connection, then you have the beginning of a "continuous period". Then, a cumulative sum assigns a grouping, which can be used for aggregation.

That is how this query works:

select visitid, unit,
       min(admissiondate) as admissiondate,
       max(dischargedate) as dischargedate
from (select t.*,
             sum(case when tprev.visitid is null then 1 else 0 end) over
                 (partition by t.visitid, t.unit order by t.admissiondate 
                 ) as grp
      from t left join
           t tprev
           on t.visitid = tprev.visitid and t.unit = tprev.unit and
              t.admissiondate = tprev.dischargedate
     ) t
group by grp, visitid, unit;

Note: This assumes that the new admission date is exactly the same as the previous discharge date. Of course, you can introduce non-equality logic if you want to check that the admission occurred within, say 10 seconds or 5 minutes of the discharge.

Comments