RAJESH RAJESH - 7 months ago 18
SQL Question

Session Duration and Time on Page calculation using SQL

I have some data in web_event table of Netezza in this below format.

vstr_id | sessn_id | sessn_ts | wbpg_nm
V1 | V1S1 | 02-02-2015 09:20:00 | /home/login
V1 | V1S1 | 02-02-2015 09:30:00 | /home/contacts
V1 | V1S1 | 02-02-2015 09:50:00 | /home/search
V2 | V2S1 | 02-02-2015 09:10:00 | /home
V2 | V2S1 | 02-02-2015 09:15:00 | /home/apps
V2 | V2S2 | 02-02-2015 09:20:00 | /home/news
V2 | V2S2 | 02-02-2015 09:23:00 | /home/news/internal


This is my source table.

I am trying to use that web_event table and create another table like below.

I want the sessn_durtn table and time_on_pg table to be loaded like below.

1) sessn_durtn column : this should be time difference between session start event and session end event according to sorted time field. It can be in represented in minutes or seconds

I am trying to to do

Insert into sessn_durtn (select VSTR_ID,
SESSN_ID,
????? as sessn_durtn,
from web_event)

vstr_id | sessn_id | seesn_durtn
V1 | V1S1 | 30mins
V2 | V2S1 | 5mins
V2 | V2S2 | 3mins


2) time_on_page column : It is the time difference between the current page and next page and the last page of the session can have 0 secs. It can be represented in minutes or seconds.

Insert into time_on_pg (select VSTR_ID,
SESSN_ID,
sessn_ts,
WBPG_NM,
????? as time_on_page
from web_event)

vstr_id | sessn_id | sessn_ts | wbpg_nm | time_on_page
V1 | V1S1 | 02-02-2015 09:20:00 | /home/login | 10mins
V1 | V1S1 | 02-02-2015 09:30:00 | /home/contacts | 20mins
V1 | V1S1 | 02-02-2015 09:50:00 | /home/search | 0mins
V2 | V2S1 | 02-02-2015 09:10:00 | /home | 5mins
V2 | V2S1 | 02-02-2015 09:15:00 | /home/apps | 0mins
V2 | V2S2 | 02-02-2015 09:20:00 | /home/news | 3mins
V2 | V2S2 | 02-02-2015 09:23:00 | /home/news/internal | 0mins


How can we do this in Netezza or any SQL query?

Answer

For the session duration:

SELECT vstr_id, MAX(sessn_ts), MIN(sessn_ts), 
TIMESTAMPDIFF(MINUTE,MIN(sessn_ts),MAX(sessn_ts)) AS sessn_durtn
FROM `web_event`
GROUP BY vstr_id, sessn_id

And for the time on page (you don't have a record of the time they leave, so I can't get the time on page for the last page of the session, so I just set that to 0. If you have that data, you could insert it with a fixed wbpg_nm that won't collide with any others, maybe 'exit' or some such):

SELECT t1.*,
IFNULL(TIMESTAMPDIFF(MINUTE, t1.sessn_ts, t2.sessn_ts), 0) AS time_on_pg
FROM
(
    SELECT w1.*,
    @rownum := @rownum + 1 AS position
    FROM `web_event` w1
    JOIN (SELECT @rownum := 0) r
    ORDER BY vstr_id, sessn_id, sessn_ts
) t1
LEFT JOIN
(
    SELECT w1.*,
    @rownum2 := @rownum2 + 1 AS position
    FROM `web_event` w1
    JOIN (SELECT @rownum2 := 0) r
    ORDER BY vstr_id, sessn_id, sessn_ts
) t2
ON t1.vstr_id = t2.vstr_id 
AND t1.sessn_id = t2.sessn_id 
AND t1.position = t2.position - 1
Comments