Buda Gavril - 1 month ago
SQL Question

SQL Server: select correct columns of duplicated records with group by clause

I have the following SQL Server table:

DECLARE @Tbl TABLE
(
    Id INT IDENTITY NOT NULL PRIMARY KEY,
    MemberId INT,
    Guid UNIQUEIDENTIFIER,
    ExpiryYear INT,
    Hash CHAR(5)
)

INSERT INTO @Tbl (MemberId, Guid, ExpiryYear, Hash)
VALUES
    (1, NEWID(), 2017, 'Hash1'),
    (1, NEWID(), 2018, 'Hash2'),
    (1, NEWID(), 2018, 'Hash2'),
    (2, NEWID(), 2020, 'Hash3'),
    (2, NEWID(), 2017, 'Hash1')


If I need the data grouped by ExpiryYear and Hash, how can I also get the Id and Guid of one matching record? I've tried something like:

select
    max(Id) as Id,
    max(MemberId) as MemberId,
    max(Guid) as Guid,
    ExpiryYear,
    Hash
from
    @Tbl
where
    MemberId = 1
group by
    ExpiryYear, Hash


But this usually gives me the Id of the third record and the Guid of the second. How can I get the Id and Guid from the same record?

The output is wrong because the Id is from the third record, but the Guid is from the second record.

I am using SQL Server 2014.

Answer

I think you want something like this:

select . . .
from (select t.*,
             row_number() over (partition by ExpiryYear, Hash order by ??) as seqnum
      from @tbl t
     ) t
where seqnum = 1;

The ?? stands for the ordering column, which determines which row is returned for each (ExpiryYear, Hash) combination. If you don't care which row you get, you can use ExpiryYear or (select NULL).
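
As a concrete illustration, here is a minimal sketch that fills in the placeholders using the table from the question. It assumes you want the row with the highest Id in each (ExpiryYear, Hash) group, which mirrors the max(Id) in the original attempt; that ordering choice and the placement of the MemberId filter inside the subquery are assumptions, not something the question specifies.

```sql
-- Sample table and data copied from the question.
DECLARE @Tbl TABLE
(
    Id INT IDENTITY NOT NULL PRIMARY KEY,
    MemberId INT,
    Guid UNIQUEIDENTIFIER,
    ExpiryYear INT,
    Hash CHAR(5)
);

INSERT INTO @Tbl (MemberId, Guid, ExpiryYear, Hash)
VALUES
    (1, NEWID(), 2017, 'Hash1'),
    (1, NEWID(), 2018, 'Hash2'),
    (1, NEWID(), 2018, 'Hash2'),
    (2, NEWID(), 2020, 'Hash3'),
    (2, NEWID(), 2017, 'Hash1');

-- Number the rows inside each (ExpiryYear, Hash) group, then keep row 1.
-- Assumption: "order by Id desc" picks the row with the highest Id per group.
select t.Id, t.MemberId, t.Guid, t.ExpiryYear, t.Hash
from (select t.*,
             row_number() over (partition by ExpiryYear, Hash
                                order by Id desc) as seqnum
      from @Tbl t
      where MemberId = 1   -- assumption: filter before numbering the rows
     ) t
where seqnum = 1;
```

Because every selected column comes from the single row numbered 1, Id and Guid always belong to the same record. With the sample data this should return the rows with Id 1 and Id 3; the Guid values differ on every run because they come from NEWID().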