SimonSimCity - 6 months ago 21

SQL Question

I have a question quite similar to How to add sequence number for groups in a SQL query without temp tables

To take the sample from there, I have a data-structure that consists of the first two columns of this table, and I want to generate the third column in my query:

Record Group GroupSequence

-------|---------|--------------

1 Chickens 1

2 Chickens 1

3 Horses 2

4 Cows 3

5 Horses 4

6 Horses 4

The difference to the quoted query is, that I need a new number if the previous line didn't have the same value in the second column.

I tried adding a row-number and using

`LAG()`

The value just needs to be a different number - it doesn't matter if it's in order. This would also be fine:

Record Group GroupSequence

-------|---------|--------------

1 Chickens 1

2 Chickens 1

3 Horses 3

4 Cows 4

5 Horses 5

6 Horses 5

Answer

I think this is what you need:

```
WITH Src AS
(
SELECT * FROM (VALUES
(1, 'Chickens'),
(2, 'Chickens'),
(3, 'Horses '),
(4, 'Cows '),
(5, 'Horses '),
(6, 'Horses '))T(Record, [Group])
), Differentiator AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY Record) -
RANK() OVER (PARTITION BY [Group] ORDER BY Record) Diff
FROM Src
)
SELECT Record, [Group], DENSE_RANK() OVER (ORDER BY [Group],Diff) NewGroup
FROM Differentiator
ORDER BY Record
```

It produces following table:

```
Record Group NewGroup
------ ----- --------
1 Chickens 1
2 Chickens 1
3 Horses 3
4 Cows 2
5 Horses 4
6 Horses 4
```

**Short explanation:**

The key is to calculate relative positions of records in whole table and in each '[Group]' group. If records are adjacent, global number is increasing by 1 and local number is increased by 1. Thus, `ROW_NUMBER() - RANK()`

is the same for all records. If there is a gap, there is also distortion in global numbering. It leads to different numbers generated by `ROW_NUMBER() - RANK()`

in separated groups.