MAV MAV - 2 months ago 13
SQL Question

TSQL Number Rows Based on change in fieldvalue and sorted on date with incremented numbers on duplicates

Say I have a data like the following:

X | 2/2/2000
X | 2/3/2000
B | 2/4/2000
B | 2/10/2000
B | 2/10/2000
J | 2/11/2000
X | 3/1/2000


I would like to get a dataset like this:

1 | X | 2/2/2000
1 | X | 2/3/2000
2 | B | 2/4/2000
2 | B | 2/10/2000
2 | B | 2/10/2000
3 | J | 2/11/2000
4 | X | 3/1/2000


So far everything I have tried has either ended up numbering each change resetting the count on each field value change or in the example leave the last X as 1.

Answer

This is a gaps and islands problem. You can use a difference of row numbers:

select dense_rank() over (order by col1, seqnum_1 - seqnum_2) as col0,
       col1, col2
from (select t.*,
             row_number() over (order by col2) as seqnum_1,
             row_number() over (partition by col1 order by col2) as seqnum_2
      from t
     ) t;

Explaining why this works is a bit cumbersome. If you run the subquery, you will see how the sequence numbers are assigned and why the difference is what you want.

Comments