Peete Peete - 13 days ago 8
SQL Question

SQL Server: how to find the record where a field is X for the first time and there are no later records where it isn't

I tried for quite some time now but cannot figure out how to best do this without using cursors. What I want to do (in SQL Server) is:


  • Find the earliest (by Date) record where Criterion=1 AND NOT followed by Criterion=0 for each Name and Category.



Or expressed differently:


  • Find the Date when Criterion turned 1 and not turned 0 again afterwards (for each Name and Category).



Some sort of CTE would seem to make sense I guess but that's not my strong suit unfortunately. So I tried nesting queries to find the latest record where Criterion=0 and then select the next record if there is one but I'm getting incorrect results. Another challenge with this is returning a record where there are only records with Criterion=1 for a Name and Category.

Here's the sample data:

Name Category Criterion Date
------------------------------------------------
Bob Cat1 1 22.11.16 08:54 X
Bob Cat2 0 21.02.16 02:29
Bob Cat3 1 22.11.16 08:55
Bob Cat3 0 22.11.16 08:56
Bob Cat4 0 21.06.12 02:30
Bob Cat4 0 18.11.16 08:18
Bob Cat4 1 18.11.16 08:19
Bob Cat4 0 22.11.16 08:20
Bob Cat4 1 22.11.16 08:50 X
Bob Cat4 1 22.11.16 08:51
Hannah Cat1 1 22.11.16 08:54 X
Hannah Cat2 0 21.02.16 02:29
Hannah Cat3 1 22.11.16 08:55
Hannah Cat3 0 22.11.16 08:56


The rows with an X after the row are the ones I want to retrieve.

It's probably not all that complicated in the end...

Answer

If you just want the name, category, and date:

select name, category, min(date)
from t
where criterion = 1 and
      not exists (select 1
                  from t t2
                  where t2.name = t.name and t2.category = t.category and
                        t2.criterion = 0 and t2.date >= t.date
                 )
group by name, category;

There are fancier ways to get this information, but this is a relatively simple method.

Actually, the fancier ways aren't particularly complicated:

select t.*
from (select t.*,
             min(case when date > maxdate_0 or maxdate_0 is NULL then date end) over (partition by name, category) as mindate_1
      from (select t.*,
                   max(case when criterion = 0 then date end) over (partition by name, category) as maxdate_0
            from t
           ) t
      where criterion = 1
     ) t
where mindate_1 = date;

EDIT:

SQL Fiddle doesn't seem to be working these days. The following is working for me (using Postgres):

with t(name, category, criterion, date) as (
    values ('Bob', 'Cat1', 1, '2016-11-16 08:54'),
           ('Bob', 'Cat2', 0, '2016-02-21 02:29'), 
           ('Bob', 'Cat3', 1, '2016-11-16 08:55'), 
           ('Bob', 'Cat3', 0, '2016-11-16 08:56'), 
           ('Bob', 'Cat4', 0, '2012-06-21 02:30'), 
           ('Bob', 'Cat4', 0, '2016-11-18 08:18'), 
           ('Bob', 'Cat4', 1, '2016-11-18 08:19'),
           ('Bob', 'Cat4', 0, '2016-11-22 08:20'),    
           ('Bob', 'Cat4', 1, '2016-11-22 08:50'),  
           ('Bob', 'Cat4', 1, '2016-11-22 08:51'), 
           ('Hannah', 'Cat1', 1, '2016-11-22 08:54'),    
           ('Hannah', 'Cat2', 0, '2016-02-21 02:29'),   
           ('Hannah', 'Cat3', 1, '2016-11-22 08:55'),
           ('Hannah', 'Cat3', 0, '2016-11-22 08:56')
          )
select t.*
from (select t.*,
             min(case when date > maxdate_0 or maxdate_0 is NULL then date end) over (partition by name, category) as mindate_1
      from (select t.*,
                   max(case when criterion = 0 then date end) over (partition by name, category) as maxdate_0
            from t
           ) t
      where criterion = 1
     ) t
where mindate_1 = date;
Comments