Andrew Williams Andrew Williams - 1 month ago 6
SQL Question

Count consecutive NULL values for given ID

I'm trying to create a query that contains a running count of ISO weeks when the account has no entry in a [Volume] table. The query is to return only a sample of the accounts, so I've created a couple of CTEs to limit the number of records and join the volume table so that weeks with no volume show up in the results.

To illustrate the result I'm looking for:

ISOWk | SurrID | Weekly Volume | No vol Count

201601 | 001 | 0 | 1
201601 | 002 | 5 | 0
201602 | 001 | 0 | 2
201602 | 002 | 0 | 1
201603 | 001 | 125| 0
201603 | 002 | 75 | 0
201604 | 001 | 0 | 1
201604 | 002 | 75 | 0


As you can see, account with SurrID 001 has no volume for weeks 201601 and 201602 so has [No vol Count] of 2 in week 201602. In week 201603, there is volume so the counter resets to 0 and increases to 1 week 201604.

From the research I've done I've managed to get a consecutive count running using ROW_NUMBER with a window function, but it doesn't reset if there is volume (as in week 201603 in my example). What I can't figure out is how to count consecutive zero values and reset when needed.

I've included my full query below so that you get the full picture (please point out any particularly bad practices here - I'm still finding my way!). It all works as intended until I include the third CTE "NDs". It then takes 45 mins to return just over 2000 rows and returns an non-resetting count for rows showing no weekly volume.

WITH Surrs AS
(
SELECT SurrID, OracleStartDate AS OSD
FROM (
SELECT ca.SurrID, OracleStartDate, ROW_NUMBER() OVER(ORDER BY OracleStartDate) as rn
FROM tblCustomerAccounts ca
JOIN tblAccountUpdates au
ON ca.SurrID = au.SurrID
WHERE CustomerType_ID IN (1,2,3,4,5,6,7,12)
AND au.ISOWk = 201641
) a
WHERE rn % 1000 = 0
),
Updates AS
(
SELECT au.ISOWk, s.SurrID, (CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus, (CASE WHEN dbo.udf_ConvertDateToISOWeek(OSD) <= BBC THEN 'B' ELSE 'F' END) AS Book
FROM Surrs s
JOIN (
SELECT ISOWk,
(SELECT BBCutOff FROM dbo.udf_CutOffWeeks(ISOWk)) AS BBC,
(SELECT FYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYS,
(SELECT FYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYE,
(SELECT BYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYS,
(SELECT BYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYE,
SurrID,
AccStatus_ID
FROM tblAccountUpdates
) au
ON au.SurrID = s.SurrID
),
NDs AS
(
SELECT u.ISOWk, u.SurrID, ROW_NUMBER() OVER (PARTITION BY u.SurrID ORDER BY u.ISOWk) AS NDCount
FROM Updates u
LEFT JOIN tblTotalVolumes tv
ON u.SurrID = tv.SurrID
AND u.ISOWk = tv.ISOWk
WHERE tv.Volume IS NULL
AND u.ISOWk >= 201601

)
SELECT tw.ISOWk,
tw.SurrID,
(CASE WHEN Volume IS NULL THEN 0 ELSE Volume END) AS [Weekly Volume],
tw.Book,
tw.AccStatus,
(CASE WHEN tw.AccStatus = 'I' AND lw.AccStatus = 'A' THEN 'Y' ELSE '' END) AS [Stopped this week],
(CASE WHEN tw.AccStatus = 'A' AND lw.AccStatus = 'I' THEN 'Y' ELSE '' END) AS [Restarted this week],
(CASE WHEN NDCount IS NULL THEN 0 ELSE NDCount END) AS [Consecutive ND Weeks]

FROM Updates tw
JOIN Updates lw
ON lw.ISOWk = dbo.udf_ConvertDateToISOWeek(DATEADD("ww",-1,dbo.udf_ConvertISOWkToDate(tw.ISOWk)))
AND tw.SurrID = lw.SurrID
LEFT JOIN tblTotalVolumes tv
ON tw.SurrID = tv.SurrID
AND tw.ISOWk = tv.ISOWk
LEFT JOIN NDs
ON tw.SurrID = nds.SurrID
AND tw.ISOWk = nds.ISOWk

ORDER BY tw.ISOWk


To reiterate what I need: the column [Consecutive ND Weeks] should count consecutive weeks where [Weekly Volume] is 0. Help will be much appreciated.

Thanks

UPDATE:

I've tried to implement @Gordon Linoff's post but my counter does not reset to 0 when there is a value for [Weekly Volume]. Here's my amended query:

SELECT t.*, (CASE WHEN [Weekly Volume] = 0 THEN ROW_NUMBER() OVER (PARTITION BY t.SurrID, grp ORDER BY ISOWk) ELSE 0 END) AS [ND Count]

FROM (
SELECT tw.ISOWk,
s.SurrID,
tw.AccStatus,
(CASE WHEN tv.Volume IS NULL THEN 0 ELSE tv.Volume END) AS [Weekly Volume],
(CASE WHEN dbo.udf_ConvertDateToISOWeek(OSD) <= BBC THEN 'B' ELSE 'F' END) AS Book,
(CASE WHEN tw.AccStatus = 'I' AND lw.AccStatus = 'A' THEN 'Y' ELSE '' END) AS [Stopped this week],
(CASE WHEN tw.AccStatus = 'A' AND lw.AccStatus = 'I' THEN 'Y' ELSE '' END) AS [Restarted this week],
SUM(CASE WHEN tv.volume > 0 THEN 1 ELSE 0 END) OVER(PARTITION BY tv.SurrID ORDER BY tv.ISOWk) AS grp

FROM (
SELECT SurrID, OracleStartDate AS OSD
FROM (
SELECT ca.SurrID, OracleStartDate, ROW_NUMBER() OVER(ORDER BY OracleStartDate) as rn
FROM tblCustomerAccounts ca
JOIN tblAccountUpdates au
ON ca.SurrID = au.SurrID
WHERE CustomerType_ID IN (1,2,3,4,5,6,7,12)
AND au.ISOWk = 201641
) a
WHERE rn % 1000 = 0
) s
JOIN (
SELECT ISOWk,
(SELECT BBCutOff FROM dbo.udf_CutOffWeeks(ISOWk)) AS BBC,
(SELECT FYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYS,
(SELECT FYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYE,
(SELECT BYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYS,
(SELECT BYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYE,
SurrID,
(CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus
FROM tblAccountUpdates
) tw
ON tw.SurrID = s.SurrID
JOIN (
SELECT ISOWk,
SurrID,
(CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus
FROM tblAccountUpdates
) lw
ON tw.SurrID = lw.SurrID
AND dbo.udf_ConvertDateToISOWeek(DATEADD("ww",-1,dbo.udf_ConvertISOWkToDate(tw.ISOWk))) = lw.ISOWk
LEFT JOIN tblTotalVolumes tv
ON tw.ISOWk = tv.ISOWk
AND tw.SurrID = tv.SurrID
) t

ORDER BY ISOWk


UPDATE:

I've now modified my query to reflect Vladimir's solution (once again, this is the full query):

SELECT ISOWk,
SurrID,
AccStatus,
[Weekly Volume],
Book,
[Stopped this week],
[Restarted this week],
RN1,
RN2,
grp,
rn3,
(CASE WHEN [Weekly Volume] = 0 THEN rn3 ELSE 0 END) AS [ND Count]
FROM (
SELECT t.ISOWk,
t.SurrID,
t.AccStatus,
t.[Weekly Volume],
t.Book,
t.[Stopped this week],
t.[Restarted this week],
rn1,
rn2,
rn1 - rn2 AS grp,
ROW_NUMBER() OVER(PARTITION BY t.SurrID, rn1-rn2 ORDER BY ISOWk) AS rn3

FROM (
SELECT tw.ISOWk,
s.SurrID,
tw.AccStatus,
(CASE WHEN tv.Volume IS NULL THEN 0 ELSE tv.Volume END) AS [Weekly Volume],
(CASE WHEN dbo.udf_ConvertDateToISOWeek(OSD) <= BBC THEN 'B' ELSE 'F' END) AS Book,
(CASE WHEN tw.AccStatus = 'I' AND lw.AccStatus = 'A' THEN 'Y' ELSE '' END) AS [Stopped this week],
(CASE WHEN tw.AccStatus = 'A' AND lw.AccStatus = 'I' THEN 'Y' ELSE '' END) AS [Restarted this week],
ROW_NUMBER() OVER(PARTITION BY tw.SurrID ORDER BY tw.ISOWk) AS rn1,
ROW_NUMBER() OVER(PARTITION BY tw.SurrID, tv.Volume ORDER BY tw.ISOWk) AS rn2

FROM (
SELECT SurrID, OracleStartDate AS OSD
FROM (
SELECT ca.SurrID, OracleStartDate, ROW_NUMBER() OVER(ORDER BY OracleStartDate) as rn
FROM tblCustomerAccounts ca
JOIN tblAccountUpdates au
ON ca.SurrID = au.SurrID
WHERE CustomerType_ID IN (1,2,3,4,5,6,7,12)
AND au.ISOWk = 201641
) a
WHERE rn % 2000 = 0
) s
JOIN (
SELECT ISOWk,
(SELECT BBCutOff FROM dbo.udf_CutOffWeeks(ISOWk)) AS BBC,
(SELECT FYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYS,
(SELECT FYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS FYE,
(SELECT BYStart FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYS,
(SELECT BYEnd FROM dbo.udf_CutOffWeeks(ISOWk)) AS BYE,
SurrID,
(CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus
FROM tblAccountUpdates
) tw
ON tw.SurrID = s.SurrID
JOIN (
SELECT ISOWk,
SurrID,
(CASE WHEN AccStatus_ID = 1 THEN 'A' ELSE 'I' END) AS AccStatus
FROM tblAccountUpdates
) lw
ON tw.SurrID = lw.SurrID
AND dbo.udf_ConvertDateToISOWeek(DATEADD("ww",-1,dbo.udf_ConvertISOWkToDate(tw.ISOWk))) = lw.ISOWk
LEFT JOIN tblTotalVolumes tv
ON tw.ISOWk = tv.ISOWk
AND tw.SurrID = tv.SurrID

) t
) x

ORDER BY ISOWk


Here's a sample of the unexpected (in that the counter doesn't reset to 0 when [Weekly Volume] has a value greater than 0, identified with the asterisks) results. The following all have the same ID, so I've removed the ID column.

ISOWk | Weekly Volume | rn1 | rn2 | grp | rn3 | ND Count |
201620 | 0 | 1 | 1 | 0 | 1 | 1 |
201621 | 0 | 2 | 2 | 0 | 2 | 2 |
201622 | 0 | 3 | 3 | 0 | 3 | 3 |
201623 | 0 | 4 | 4 | 0 | 4 | 4 |
201624 | 0 | 5 | 5 | 0 | 5 | 5 |
201625 | 53 | 6 | 1 | 5 | 1 | 0 |
201626 | 49 | 7 | 1 | 6 | 1 | 0 |
201627 | 98 | 8 | 1 | 7 | 1 | 0 |
201628 | 54 | 9 | 1 | 8 | 1 | 0 |
201629 | 53 | 10 | 2 | 8 | 2 | 0 |
201630 | 103 | 11 | 1 | 10 | 1 | 0 |
201631 | 59 | 12 | 1 | 11 | 1 | 0 |
201632 | 35 | 13 | 1 | 12 | 1 | 0 |
201633 | 0 | 14 | 6 | 8 | 3 | 3 |**
201634 | 0 | 15 | 7 | 8 | 4 | 4 |**
201635 | 0 | 16 | 8 | 8 | 5 | 5 |**
201636 | 0 | 17 | 9 | 8 | 6 | 6 |**
201637 | 87 | 18 | 1 | 17 | 1 | 0 |
201638 | 136 | 19 | 1 | 18 | 1 | 0 |
201639 | 56 | 20 | 1 | 19 | 1 | 0 |
201640 | 70 | 21 | 1 | 20 | 0 | 0 |
201641 | 77 | 22 | 1 | 21 | 1 | 0 |


There are other instances of this issue in my dataset.

Answer

It looks like a gaps-and-islands problem.

Sample data

DECLARE @T TABLE(ISOWk int, SurrID char(3), WeeklyVolume int);
INSERT INTO @T(ISOWk, SurrID, WeeklyVolume) VALUES
(201601, '001',  0),
(201601, '002',  5),
(201602, '001',  0),
(201602, '002',  0),
(201603, '001',125),
(201603, '002', 75),
(201604, '001',  0),
(201604, '002', 75),
(201620, '003',  0),
(201621, '003',  0),
(201622, '003',  0),
(201623, '003',  0),
(201624, '003',  0),
(201625, '003', 53),
(201626, '003', 49),
(201627, '003', 98),
(201628, '003', 54),
(201629, '003', 53),
(201630, '003',103),
(201631, '003', 59),
(201632, '003', 35),
(201633, '003',  0),
(201634, '003',  0),
(201635, '003',  0),
(201636, '003',  0),
(201637, '003', 87),
(201638, '003',136),
(201639, '003', 56),
(201640, '003', 70),
(201641, '003', 77),
(201601, '004',  0),
(201602, '004',  6),
(201603, '004',  0),
(201604, '004',  0);

I added your extended sample with SurrID=003 and added mine sample with SurrID=004.

Query

WITH
CTE
AS
(
    SELECT
        ISOWk
        ,SurrID
        ,WeeklyVolume
        ,ROW_NUMBER() OVER (PARTITION BY SurrID ORDER BY ISOWk) AS rn1
        ,ROW_NUMBER() OVER (PARTITION BY SurrID,WeeklyVolume ORDER BY ISOWk) AS rn2
    FROM @T
)
,CTE2
AS
(
    SELECT
        ISOWk
        ,SurrID
        ,WeeklyVolume
        ,rn1
        ,rn2
        ,rn1-rn2 AS grp
        ,ROW_NUMBER() OVER (PARTITION BY SurrID,WeeklyVolume,rn1-rn2 ORDER BY ISOWk) AS rn3
    FROM CTE
)
SELECT
    ISOWk
    ,SurrID
    ,WeeklyVolume
    ,rn1
    ,rn2
    ,grp
    ,rn3
    ,CASE WHEN WeeklyVolume = 0 THEN rn3 ELSE 0 END AS NoVolumeCount
FROM CTE2
ORDER BY SurrID, ISOWk;

Result

+--------+--------+--------------+-----+-----+-----+-----+---------------+
| ISOWk  | SurrID | WeeklyVolume | rn1 | rn2 | grp | rn3 | NoVolumeCount |
+--------+--------+--------------+-----+-----+-----+-----+---------------+
| 201601 |    001 |            0 |   1 |   1 |   0 |   1 |             1 |
| 201602 |    001 |            0 |   2 |   2 |   0 |   2 |             2 |
| 201603 |    001 |          125 |   3 |   1 |   2 |   1 |             0 |
| 201604 |    001 |            0 |   4 |   3 |   1 |   1 |             1 |
| 201601 |    002 |            5 |   1 |   1 |   0 |   1 |             0 |
| 201602 |    002 |            0 |   2 |   1 |   1 |   1 |             1 |
| 201603 |    002 |           75 |   3 |   1 |   2 |   1 |             0 |
| 201604 |    002 |           75 |   4 |   2 |   2 |   2 |             0 |
| 201620 |    003 |            0 |   1 |   1 |   0 |   1 |             1 |
| 201621 |    003 |            0 |   2 |   2 |   0 |   2 |             2 |
| 201622 |    003 |            0 |   3 |   3 |   0 |   3 |             3 |
| 201623 |    003 |            0 |   4 |   4 |   0 |   4 |             4 |
| 201624 |    003 |            0 |   5 |   5 |   0 |   5 |             5 |
| 201625 |    003 |           53 |   6 |   1 |   5 |   1 |             0 |
| 201626 |    003 |           49 |   7 |   1 |   6 |   1 |             0 |
| 201627 |    003 |           98 |   8 |   1 |   7 |   1 |             0 |
| 201628 |    003 |           54 |   9 |   1 |   8 |   1 |             0 |
| 201629 |    003 |           53 |  10 |   2 |   8 |   1 |             0 |
| 201630 |    003 |          103 |  11 |   1 |  10 |   1 |             0 |
| 201631 |    003 |           59 |  12 |   1 |  11 |   1 |             0 |
| 201632 |    003 |           35 |  13 |   1 |  12 |   1 |             0 |
| 201633 |    003 |            0 |  14 |   6 |   8 |   1 |             1 |
| 201634 |    003 |            0 |  15 |   7 |   8 |   2 |             2 |
| 201635 |    003 |            0 |  16 |   8 |   8 |   3 |             3 |
| 201636 |    003 |            0 |  17 |   9 |   8 |   4 |             4 |
| 201637 |    003 |           87 |  18 |   1 |  17 |   1 |             0 |
| 201638 |    003 |          136 |  19 |   1 |  18 |   1 |             0 |
| 201639 |    003 |           56 |  20 |   1 |  19 |   1 |             0 |
| 201640 |    003 |           70 |  21 |   1 |  20 |   1 |             0 |
| 201641 |    003 |           77 |  22 |   1 |  21 |   1 |             0 |
| 201601 |    004 |            0 |   1 |   1 |   0 |   1 |             1 |
| 201602 |    004 |            6 |   2 |   1 |   1 |   1 |             0 |
| 201603 |    004 |            0 |   3 |   2 |   1 |   1 |             1 |
| 201604 |    004 |            0 |   4 |   3 |   1 |   2 |             2 |
+--------+--------+--------------+-----+-----+-----+-----+---------------+

I included intermediate columns in the result, so you can understand how it works.

Standard approach for gaps-and-islands is two sequences of ROW_NUMBER - one is a plain sequence (rn1), second is partitioned by WeeklyVolume (rn2).

The difference between rn1 and rn2 gives the ID of the group (island) (grp). Calculate another sequence of row numbers partitioned by WeeklyVolume and this group (rn3) and use it only when WeeklyVolume is zero.

Obviously, everything above is partitioned by SurrID at first.

In the first variant of the answer I forgot to include WeeklyVolume in the last partitioning for rn3.