DenStudent DenStudent - 7 days ago 9
SQL Question

SQL Server partition by gives duplicate records

I have following table:

Date | ID | firstname
---------+----+------------
20161128 | 1 | Adam
20161128 | 2 | Steve
20161128 | 2 | Steve
20161128 | 3 | Aaron
20161129 | 1 | Adam
20161129 | 2 | Steve
20161129 | 2 | Steve
20161129 | 3 | Aaron


I want to get the first row by ID for one particular date.

So what I had was:

SELECT *
FROM tableA
WHERE Date = 20161128


this however, gives all records. So I used the partition over function:

SELECT
*,
row_number() over(partition by ID order by Date desc)
FROM tableA
WHERE Date = 20161128


In this case, I get following result:

Date | ID | firstname | rownum
---------+----+-----------+-------
20161129 | 1 | Adam | 1
20161129 | 1 | Adam | 2
20161129 | 2 | Steve | 1
20161129 | 2 | Steve | 2
20161129 | 2 | Steve | 3
20161129 | 2 | Steve | 4
20161129 | 2 | Steve | 5
20161129 | 2 | Steve | 6
20161129 | 3 | Aaron | 1
20161129 | 3 | Aaron | 2


As you can see, Most ID's appear 2 times. (ID 2 even appears 6 times). In other cases, I see a record appear 10 times even though it would only have one record if I used the first query.

Any idea why this happens and how this can be fixed? My guess would be the date/where clause, but I don't see how this can effect the result this much.

Answer

You need a WHERE clause if you want to filter the records:

SELECT a.*
FROM (SELECT a.*,
             row_number() over(partition by ID order by Date desc) as seqnum
      FROM tableA a
      WHERE a.Date = '20161128'
    ) a
WHERE seqnum = 1;

This will return one row per date per id number.