zaptask - 7 months ago
SQL Question

Cumulative distinct count filtered by last value - T-SQL

I am trying to get exactly the same result as in this question:

Cumulative distinct count filtered by last value - DAX

but in SQL Server. For convenience, I am copying the whole problem description.

I have a dataset:

month  name  flag
1      abc   TRUE
2      xyz   TRUE
3      abc   TRUE
4      xyz   TRUE
5      abc   FALSE
6      abc   FALSE


I want to calculate, for each month, the cumulative distinct count of 'name', counting only the names whose most recent 'flag' value is TRUE. That is, I want this result:

month  count
1      1
2      2
3      2
4      2
5      1
6      1


In months 5 and 6 'abc' should be excluded because the flag switched to 'FALSE' in month 5.
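
To pin down the intended semantics before writing any SQL, here is a minimal reference sketch in Python. It is not a SQL Server solution, only a way to state the rule precisely: for each month, count the names whose most recent flag up to that month is TRUE. The function name `cumulative_flagged_count` is made up for illustration.

```python
# Reference semantics only: for each month, count the names whose most
# recent flag (up to and including that month) is TRUE.
rows = [  # (month, name, flag), copied from the dataset in the question
    (1, "abc", True),
    (2, "xyz", True),
    (3, "abc", True),
    (4, "xyz", True),
    (5, "abc", False),
    (6, "abc", False),
]


def cumulative_flagged_count(rows):
    """Return [(month, count)], counting names whose latest flag is True."""
    latest = {}  # name -> most recent flag seen so far
    result = {}  # month -> count (dicts keep insertion order)
    for month, name, flag in sorted(rows):
        latest[name] = flag
        # Overwrite so that several rows in one month yield a single entry.
        result[month] = sum(1 for f in latest.values() if f)
    return list(result.items())


print(cumulative_flagged_count(rows))
# [(1, 1), (2, 2), (3, 2), (4, 2), (5, 1), (6, 1)]
```

The output matches the expected result table, so any SQL candidate can be checked against this function on other inputs as well.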

I am thinking about using the "over" clause with "partition by", but I don't have any experience with it, so it's a struggle for me.

Answer

You can do a cumulative distinct count as:

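-- Running distinct count: each name contributes 1 only on its first row.
select t.*,
       sum(case when seqnum = 1 then 1 else 0 end) over (order by month) as cnt
from (select t.*,
             -- seqnum = 1 marks the first month in which a name appears
             row_number() over (partition by name order by month) as seqnum
      from t
     ) t;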
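
To see what this first query produces on the sample data, the sketch below runs it through Python's `sqlite3` module. This rests on assumptions: SQLite (3.25 or later, for window functions) stands in for SQL Server, the table is called `t` as in the query, and `flag` is stored as lowercase text.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for SQL Server; table t holds the
# sample rows from the question, with flag stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (month integer, name text, flag text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [(1, "abc", "true"), (2, "xyz", "true"), (3, "abc", "true"),
     (4, "xyz", "true"), (5, "abc", "false"), (6, "abc", "false")],
)

# Same query as above, selecting only month and cnt for readability.
plain_counts = conn.execute("""
    select month,
           sum(case when seqnum = 1 then 1 else 0 end) over (order by month) as cnt
    from (select t.*,
                 row_number() over (partition by name order by month) as seqnum
          from t
         ) t
    order by month
""").fetchall()

print(plain_counts)
# [(1, 1), (2, 2), (3, 2), (4, 2), (5, 2), (6, 2)]
```

The count stays at 2 in months 5 and 6 because this query ignores the flag, which is why the flag has to be incorporated to get the result asked for.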

I don't understand the logic for incorporating the flag.

However, you can replicate the results in the question by counting the first TRUE row of each name as +1 and the first FALSE row of each name as -1:

-- +1 on the first TRUE row of a name, -1 on its first FALSE row.
select t.*,
       sum(case when seqnum = 1 and flag = 'true' then 1
                when seqnum = 1 and flag = 'false' then -1
                else 0
           end) over (order by month) as cnt
from (select t.*,
             -- numbered separately for each (name, flag) pair
             row_number() over (partition by name, flag order by month) as seqnum
      from t
     ) t;
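
Note that this query numbers rows once per (name, flag) pair, so it would not count a name again if its flag later flipped back to TRUE, and it would subtract 1 for a name whose very first row is FALSE. If that matters, one possible alternative (a different technique, not part of the answer above) is to compare each row's flag with the previous flag for the same name using `lag`. The sketch below runs such a query through Python's `sqlite3`. It assumes SQLite 3.25+ as a stand-in for SQL Server, `flag` stored as the text 'true'/'false', and one row per month as in the sample. The names `LAG_QUERY`, `run` and `flip_back` are made up for illustration.

```python
import sqlite3

# Assumptions: SQLite (3.25+) stands in for SQL Server, flag is stored as the
# text 'true'/'false', and there is one row per month as in the sample.
LAG_QUERY = """
    select month, sum(delta) over (order by month) as cnt
    from (select t.*,
                 case when flag = 'true'
                           and coalesce(lag(flag) over (partition by name order by month),
                                        'false') = 'false'
                      then 1   -- name becomes active (first row, or FALSE -> TRUE)
                      when flag = 'false'
                           and lag(flag) over (partition by name order by month) = 'true'
                      then -1  -- name drops out (TRUE -> FALSE)
                      else 0
                 end as delta
          from t
         ) t
    order by month
"""


def run(rows):
    """Load (month, name, flag) rows into an in-memory table t and run LAG_QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (month integer, name text, flag text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    return conn.execute(LAG_QUERY).fetchall()


sample = [(1, "abc", "true"), (2, "xyz", "true"), (3, "abc", "true"),
          (4, "xyz", "true"), (5, "abc", "false"), (6, "abc", "false")]
print(run(sample))     # [(1, 1), (2, 2), (3, 2), (4, 2), (5, 1), (6, 1)]

# Hypothetical extra case: abc flips back to TRUE in month 3.
flip_back = [(1, "abc", "true"), (2, "abc", "false"), (3, "abc", "true")]
print(run(flip_back))  # [(1, 1), (2, 0), (3, 1)]
```

On the sample data this gives the same counts as the query above. The two differ only on inputs like the flip-back case, which the question does not cover.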