Imran Ali Khan Imran Ali Khan - 1 month ago 9
SQL Question

Counting based on string and null

Here is my sample data:

id FirstName LastName HouseNo MyCount
1 A C 1-1 2
2 B C 1-1 2
4 D A 3
5 F A 3
6 J A 3
7 Q X 1-2 3
8 D X 1-2 3
9 D X 1-2 3
10 A C 1-3 3
11 B C 1-3 3
12 C C 1-3 3
14 F K 2
15 J K 2
16 Q X 1-5 1


With the above data I want to take count of records with the same
HouseNo
and
LastName
.

For this I am using

SELECT COUNT(ID) AS _COUNT FROM MYTABLE GROUP BY LASTNAME, HOUSENO


but the statement above has one issue. In the data some of the records don't have
HouseNo
. In the example above ID 4,5,6 and 14,15 don't have
HouseNo
. So, the above statement is returning 5 but it should return 3 and 2 separately.

Main Aim


  1. Take the count based on
    LastName
    and
    HouseNo

  2. Take the count of those records that don't have
    HouseNo
    (They will comes in series).

  3. The coming count should be updated in
    MyCount



How do I get this count?

Edit For Bounty:

Sample Data

id FirstName LastName HouseNo MyCount CountId
1 Imran Khan 1-1
2 Waseem Khan 1-1
3 Rihan Khan 1-1
4 Moiz Shaikh 1-2
5 Zbair Shaikh 1-2
6 Sultan Shaikh 1-2
7 Zaid Khan
10 Parvez Patel 1-3
11 Ahmed Patel 1-3
12 Rahat Syed 1-4
13 Talha Khan
14 Zia Khan
15 Arshad Patel 1-3
16 Samad Patel 1-3
17 Raees Syed 1-4
18 Azmat Khan
19 Imran Khan


Expected Result :

id FirstName LastName HouseNo MyCount CountId
1 Imran Khan 1-1 3 1
2 Waseem Khan 1-1 3 1
3 Rihan Khan 1-1 3 1
4 Moiz Shaikh 1-2 3 2
5 Zbair Shaikh 1-2 3 2
6 Sultan Shaikh 1-2 3 2
7 Zaid Khan 1 3
10 Parvez Patel 1-3 2 4
11 Ahmed Patel 1-3 2 4
12 Rahat Syed 1-4 1 5
13 Talha Khan 2 6
14 Zia Khan 2 6
15 Arshad Patel 1-3 2 7
16 Samad Patel 1-3 2 7
17 Raees Syed 1-4 1 8
18 Azmat Khan 2 9
19 Imran Khan 2 9



  1. In the sample data
    MyCount
    and
    CountId
    are blank and should be filled.

  2. MyCount
    will be based on
    HouseNo
    and
    LastName
    , please see ID 1 to 3, its last name is khan with house no 1-1 so the
    MyCount
    of ID 1 to 3 will be 3, and
    CountId
    will be 1.

  3. In the sample data there are many records that don't have
    HouseNo
    , so for that case same last name in a series will be counted. Please see ID 7, its count will be 1. See also ID 18 and 19, its count will be 2.

  4. CountId
    is a Serial no of id count. Please see ID 1 to 3, it is 1 due to same house no and same last name.


Answer

It looks like the main confusion is caused by your SQL statement at the beginning of the question where you simply GROUP BY LASTNAME, HOUSENO.

If you wanted a simple grouping, your query would be correct. But, then you show us a more detailed sample data with expected result and it becomes clear that you want not just a grouping (which doesn't care about the order of rows in the data), but you want to group rows based on their sequence.

It is a classic problem called gaps-and-islands. In SQL Server 2008 it can be done using few calls to ROW_NUMBER function.

Sample data

DECLARE @T TABLE 
    (id int PRIMARY KEY
    ,FirstName nvarchar(50)
    ,LastName nvarchar(50)
    ,HouseNo nvarchar(50)
    ,MyCount int
    ,CountId int);

INSERT INTO @T (id, FirstName, LastName, HouseNo) VALUES
(1 , 'Imran ', 'Khan  ', '1-1'),
(2 , 'Waseem', 'Khan  ', '1-1'),
(3 , 'Rihan ', 'Khan  ', '1-1'),
(4 , 'Moiz  ', 'Shaikh', '1-2'),
(5 , 'Zbair ', 'Shaikh', '1-2'),
(6 , 'Sultan', 'Shaikh', '1-2'),
(7 , 'Zaid  ', 'Khan  ',  NULL),
(10, 'Parvez', 'Patel ', '1-3'),
(11, 'Ahmed ', 'Patel ', '1-3'),
(12, 'Rahat ', 'Syed  ', '1-4'),
(13, 'Talha ', 'Khan  ',  NULL),
(14, 'Zia   ', 'Khan  ',  NULL),
(15, 'Arshad', 'Patel ', '1-3'),
(16, 'Samad ', 'Patel ', '1-3'),
(17, 'Raees ', 'Syed  ', '1-4'),
(18, 'Azmat ', 'Khan  ',  NULL),
(19, 'Imran ', 'Khan  ',  NULL);

SELECT query

WITH
CTE_RN
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,ROW_NUMBER() OVER (PARTITION BY LastName, HouseNo ORDER BY ID) AS rn1
        ,ROW_NUMBER() OVER (ORDER BY ID) AS rn2
    FROM @T AS T
)
,CTE_GRoups
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,rn1
        ,rn2
        ,rn2-rn1 AS GroupNumber
        ,COUNT(ID) OVER (PARTITION BY LastName, HouseNo, rn2-rn1) AS NewMyCount
        ,MIN(ID) OVER (PARTITION BY LastName, HouseNo, rn2-rn1) AS GroupMinID
    FROM CTE_RN
)
SELECT
    id
    ,FirstName
    ,LastName
    ,HouseNo
    ,rn1
    ,rn2
    ,GroupNumber
    ,NewMyCount
    ,GroupMinID
    ,DENSE_RANK() OVER (ORDER BY GroupMinID) AS NewCountId
FROM CTE_GRoups
ORDER BY ID;

Result

+----+-----------+----------+---------+-----+-----+-------------+------------+------------+------------+
| id | FirstName | LastName | HouseNo | rn1 | rn2 | GroupNumber | NewMyCount | GroupMinID | NewCountId |
+----+-----------+----------+---------+-----+-----+-------------+------------+------------+------------+
|  1 | Imran     | Khan     | 1-1     |   1 |   1 |           0 |          3 |          1 |          1 |
|  2 | Waseem    | Khan     | 1-1     |   2 |   2 |           0 |          3 |          1 |          1 |
|  3 | Rihan     | Khan     | 1-1     |   3 |   3 |           0 |          3 |          1 |          1 |
|  4 | Moiz      | Shaikh   | 1-2     |   1 |   4 |           3 |          3 |          4 |          2 |
|  5 | Zbair     | Shaikh   | 1-2     |   2 |   5 |           3 |          3 |          4 |          2 |
|  6 | Sultan    | Shaikh   | 1-2     |   3 |   6 |           3 |          3 |          4 |          2 |
|  7 | Zaid      | Khan     | NULL    |   1 |   7 |           6 |          1 |          7 |          3 |
| 10 | Parvez    | Patel    | 1-3     |   1 |   8 |           7 |          2 |         10 |          4 |
| 11 | Ahmed     | Patel    | 1-3     |   2 |   9 |           7 |          2 |         10 |          4 |
| 12 | Rahat     | Syed     | 1-4     |   1 |  10 |           9 |          1 |         12 |          5 |
| 13 | Talha     | Khan     | NULL    |   2 |  11 |           9 |          2 |         13 |          6 |
| 14 | Zia       | Khan     | NULL    |   3 |  12 |           9 |          2 |         13 |          6 |
| 15 | Arshad    | Patel    | 1-3     |   3 |  13 |          10 |          2 |         15 |          7 |
| 16 | Samad     | Patel    | 1-3     |   4 |  14 |          10 |          2 |         15 |          7 |
| 17 | Raees     | Syed     | 1-4     |   2 |  15 |          13 |          1 |         17 |          8 |
| 18 | Azmat     | Khan     | NULL    |   4 |  16 |          12 |          2 |         18 |          9 |
| 19 | Imran     | Khan     | NULL    |   5 |  17 |          12 |          2 |         18 |          9 |
+----+-----------+----------+---------+-----+-----+-------------+------------+------------+------------+

Here I included in the result all intermediary steps, so you can see how it works. The main part is two sets of ROW_NUMBERs. The rn1 sequence restarts for each LastName, HouseNo. It is partitioned by LastName, HouseNo. The rn2 is a simple increasing sequence without gaps. We need it, because original ID defines the order, but can have gaps.

Then we subtract these two sequences and difference gives us the GroupNumber.

Counting the number of elements in a group is simple COUNT, which gives us NewMyCount.

Enumerating groups with sequential numbers without gaps is done in two steps. At first MIN gives an identifier for a group, then DENSE_RANK generates a sequence of NewCountId without gaps.


If you want to actually update original table with the calculated NewMyCount and NewCountId, it is easy to turn the SELECT query above into UPDATE query:

UPDATE query

WITH
CTE_RN
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,ROW_NUMBER() OVER (PARTITION BY LastName, HouseNo ORDER BY ID) AS rn1
        ,ROW_NUMBER() OVER (ORDER BY ID) AS rn2
    FROM @T AS T
)
,CTE_GRoups
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,rn1
        ,rn2
        ,rn2-rn1 AS GroupNumber
        ,COUNT(ID) OVER (PARTITION BY LastName, HouseNo, rn2-rn1) AS NewMyCount
        ,MIN(ID) OVER (PARTITION BY LastName, HouseNo, rn2-rn1) AS GroupMinID
    FROM CTE_RN
)
,CTE_Update
AS
(
    SELECT
        id
        ,FirstName
        ,LastName
        ,HouseNo
        ,MyCount
        ,CountId
        ,rn1
        ,rn2
        ,GroupNumber
        ,NewMyCount
        ,GroupMinID
        ,DENSE_RANK() OVER (ORDER BY GroupMinID) AS NewCountId
    FROM CTE_GRoups
)
UPDATE CTE_Update
SET
    MyCount = NewMyCount
    ,CountId = NewCountId
;

Result

SELECT *
FROM @T
ORDER BY ID;

+----+-----------+----------+---------+---------+---------+
| id | FirstName | LastName | HouseNo | MyCount | CountId |
+----+-----------+----------+---------+---------+---------+
|  1 | Imran     | Khan     | 1-1     |       3 |       1 |
|  2 | Waseem    | Khan     | 1-1     |       3 |       1 |
|  3 | Rihan     | Khan     | 1-1     |       3 |       1 |
|  4 | Moiz      | Shaikh   | 1-2     |       3 |       2 |
|  5 | Zbair     | Shaikh   | 1-2     |       3 |       2 |
|  6 | Sultan    | Shaikh   | 1-2     |       3 |       2 |
|  7 | Zaid      | Khan     | NULL    |       1 |       3 |
| 10 | Parvez    | Patel    | 1-3     |       2 |       4 |
| 11 | Ahmed     | Patel    | 1-3     |       2 |       4 |
| 12 | Rahat     | Syed     | 1-4     |       1 |       5 |
| 13 | Talha     | Khan     | NULL    |       2 |       6 |
| 14 | Zia       | Khan     | NULL    |       2 |       6 |
| 15 | Arshad    | Patel    | 1-3     |       2 |       7 |
| 16 | Samad     | Patel    | 1-3     |       2 |       7 |
| 17 | Raees     | Syed     | 1-4     |       1 |       8 |
| 18 | Azmat     | Khan     | NULL    |       2 |       9 |
| 19 | Imran     | Khan     | NULL    |       2 |       9 |
+----+-----------+----------+---------+---------+---------+