user6745154 user6745154 - 3 months ago 23
SQL Question

Grouping continuous sessions over multiple rows in sql

I am trying to look at session lengths to see how long users are logged in for consecutive minutes. The problem is sessions are broken up into rows by activity. If the Activity end is the same as the next rows activity start then they are a part of the same session and should be counted together.

Ideally it would look like this, I have ACTIVITY_START and ACTIVITY_END and would like to create SESSION:

ACTVITY_START | ACTIVITY_END | SESSION
----------------|-----------------|----------
2/16/2016 19:00 | 2/16/2016 20:51 | 1
2/16/2016 20:51 | 2/16/2016 20:52 | 1
2/16/2016 20:52 | 2/16/2016 20:54 | 1
2/16/2016 20:54 | 2/16/2016 21:25 | 1
2/16/2016 21:25 | 2/16/2016 21:26 | 1
2/16/2016 21:26 | 2/16/2016 22:13 | 1
2/16/2016 22:13 | 2/16/2016 22:14 | 1
2/16/2016 22:14 | 2/16/2016 22:41 | 1
2/18/2016 21:59 | 2/18/2016 23:07 | 2
2/18/2016 23:07 | 2/19/2016 0:00 | 2
2/19/2016 0:00 | 2/19/2016 1:56 | 2
2/19/2016 1:56 | 2/19/2016 1:58 | 2
2/19/2016 19:08 | 2/19/2016 20:53 | 3
2/19/2016 20:53 | 2/20/2016 0:00 | 3
2/20/2016 0:00 | 2/20/2016 0:05 | 3
2/20/2016 0:05 | 2/20/2016 2:00 | 3
2/20/2016 2:00 | 2/20/2016 2:12 | 3
2/20/2016 2:12 | 2/20/2016 2:28 | 3
2/20/2016 2:28 | 2/20/2016 2:32 | 3
2/20/2016 12:38 | 2/20/2016 14:16 | 4
2/20/2016 14:26 | 2/20/2016 14:27 | 5


Originally what I did was line up the activity end with the next activity start in the same row and did:

SELECT DENSE_RANK() OVER (ORDER BY CASE WHEN A.ACTIVITY_END = B.ACTIVITY_START THEN 0 ELSE 1 END)


But that ranks all sessions together separate from all times when the session changes.

How do I get it so SESSION increments up each time a new session starts?

Answer

Even though you tagged your question as using a MySQL database, the fact that you performed a "successful" query using the dense_rank() window function tells me that it's definitely not MySQL. MySQL doesn't currently support window functions.

Still, based on the fact that, whatever your database is, it supports windows functions, I'll go ahead and assume that it includes support for the lag and sum window functions. If that's the case, you can use the lag function to identify which rows are not a continuation of the previous row. And then you can use a cumulative sum to generate the session ids based on the values generated by the lag function:

with cte as (
  select a.*,
         case when a.activity_start = lag(a.activity_end) over (order by a.activity_start)
              then 0 else 1 end as grp_id
    from activity a
)
select activity_start,
       activity_end,
       sum(grp_id) over (order by activity_start) as session_id
  from cte
   order by activity_start