Patrick Gregorio Patrick Gregorio - 2 months ago 5x
SQL Question

SQL Show Dates Per Status

I have a table that looks like so:

id animal_id transfer_date status_from status_to
100 5265 01-Jul-2016 NULL P
101 5265 22-Jul-2016 P A
102 5265 26-Jul-2016 A B
103 5265 06-Aug-2016 B A

I want to create a view to show me the movement of the animal with start and end dates like the following:

animal_id status start_date end_date
5265 NULL NULL 30-Jun-2016
5265 P 01-Jul-2016 21-Jul-2016
5265 A 22-Jul-2016 25-Jul-2016
5265 B 26-Jul-2016 05-Aug-2016
5265 A 06-Aug-2016 SYSDATE OR NULL (current status)

As much as I want to provide a query that I've tried, I have none. I don't even know what to search for.


Something like this may be more efficient than a join. Alas, I didn't see a way to avoid scanning the table twice.

NOTE: I didn't use an ORDER BY clause (and indeed, if I had the ordering would be weird, since I used to_char on the dates to format them). If you need this in further processing, it is best to NOT wrap the dates within to_char.

   input_data ( id, animal_id, transfer_date, status_from, status_to) as (
     select 100, 5265, to_date('01-Jul-2016', 'dd-Mon-yyyy'), null, 'P' from dual union all
     select 101, 5265, to_date('22-Jul-2016', 'dd-Mon-yyyy'), 'P' , 'A' from dual union all
     select 102, 5265, to_date('26-Jul-2016', 'dd-Mon-yyyy'), 'A' , 'B' from dual union all
     select 103, 5265, to_date('06-Aug-2016', 'dd-Mon-yyyy'), 'B' , 'A' from dual
select animal_id,
       lag (status_to) over (partition by animal_id order by transfer_date) as status,
       to_char(lag (transfer_date) over (partition by animal_id order by transfer_date),
                                                             'dd-Mon-yyyy') as start_date,
       to_char(transfer_date - 1, 'dd-Mon-yyyy') as  end_date
  from input_data
union all
select     animal_id, 
           max(status_to) keep (dense_rank last order by transfer_date), 
           to_char(max(transfer_date), 'dd-Mon-yyyy'), 
  from     input_data
  group by animal_id

---------- ------ -------------------- --------------------
      5265                             30-Jun-2016         
      5265 P      01-Jul-2016          21-Jul-2016         
      5265 A      22-Jul-2016          25-Jul-2016         
      5265 B      26-Jul-2016          05-Aug-2016         
      5265 A      06-Aug-2016

Added: Explanation of how this works. First, there is a "WITH clause" to create the input data from the OP's message; this is a standard technique, anyone who is not familiar with factored subqueries (CTE, WITH clause) - introduced in Oracle 11.1 - will do themselves (and the rest of us!) a lot of good by reading about it/them.

The query joins together rows from two sources. In one branch, I use the lag() analytic function; it orders rows, within each group by the columns in the "partition by" clause, according to the ordering by the column in the "order by" clause. So for example, the lag(status_to) will look at all the rows within the same animal_id, it will order them by transfer_date, and for each row, it will pick the status_to from the PREVIOUS row (hence "lag"). The rest of that part of the union works similarly.

I have a second part to the union... as you can see in the original post, there are four rows, but the output must have five. In general that suggests a union of some sort will be needed somewhere in the solution (either directly and obviously as in my solution, or via a self-join or in any other way). Here I just another row for the last status (which is still "current"). I use dense_rank last which, within each group (how shown in a GROUP BY), selects just the last row by transfer_date.

To understand how the query works, it may help to, first, comment out the lines union all and select... group by animal_id and run what's left. That will show what the first part of the query does. Then un-comment those lines, and instead comment the first part, from the first select animal_id to union all (comment these two lines and everything in between). Run the query again, this will show just the last row for each animal_id.

Of course, in the sample the OP provided there is only one animal_id; if you like, you can add a few more rows (for example in the WITH clause) with different animal_id. Only now the partition by animal_id and the group by animal_id become important; with only one animal_id they wouldn't be needed (for example, if all the rows are already filtered by WHERE animal_id = 5265 somewhere else in a subquery).