Jo Sam Jo Sam - 5 months ago 20
SQL Question

ROW_NUMBER applied not in sequence of rows

USING SQL SERVER 2012
I have a requirement in which I have to clean the source slightly to get the desired output.

I have Vehicle Name, GPSTime, Login/Logoff, DriverID as columns.
Because source data is not clean, we are trying to generate a table out of this which is cleaner with each login having a subsequent logoff.

I have a data like below

Vehicle GPS Login/Logoff Driver


Veh1 28-01-2016 06:30 Login D1
Veh1 28-01-2016 06:35 Login D1
Veh1 28-01-2016 06:40 Login D1
Veh1 28-01-2016 09:40 Logoff D1
Veh1 28-01-2016 10:30 Login D2
Veh1 28-01-2016 12:30 Logoff D2
Veh1 28-01-2016 15:30 Login D1
Veh1 28-01-2016 17:30 Logoff D1


I am trying to get the output as

Veh1 28-01-2016 06:30 Login D1 --> Tricky bit to take the first login
Veh1 28-01-2016 09:40 Logoff D1
Veh1 28-01-2016 10:30 Login D2
Veh1 28-01-2016 12:30 Logoff D2
Veh1 28-01-2016 15:30 Login D1
Veh1 28-01-2016 17:30 Logoff D1


I tried the approach of row_number, ROW_NUMBER() OVER (PARTITION BY DriverID ORDER BY GPSTime asc) but whatever i do its putting a sequence number for all login for that driver continuously though its not the adjacent row. I was planning on using the required sequence number to enhance the logic to derive the desired output.

Is there any other way ? Any suggestions would help.

Regards,
Jo

Answer

Your requirements are not very clear, but it sounds like you want to remove rows that are consecutive logins or logoffs (per driver?), and only keep the first row when the login/logoff status changes.

If that is the case, then using the lag() window function can come in pretty handy.

Here is a query that can give you an idea of how to get this done:

select t.vehicle, t.gps, t.login_logoff, t.driver
  from (select t.*,
               case 
                 when lag(login_logoff) 
                     over (partition by driver 
                               order by gps) = login_logoff
                 then 1 else 0 end as is_duplicate_row
          from tbl t) t
 where t.is_duplicate_row = 0
 order by t.gps
Comments