abhay abhay - 5 months ago 12x
SQL Question

Selecting SQL TOP N rows from group, row_number, rank not working

In SQL Server, I have one table with following fields. I would like to get Top N rows from each group. I have tried following sql query with Rank() and Row_number() but rank and row numbers are not in sequence so not able to limit the rows.

Here is SQL query I tried, as you can see in result, rank column does not have sequential number so "where" clause will not work for top 3 rows for each group.

Similar result observed with ROW_NUMBER(), DENSE_RANK()

SELECT Rs.RunID,rs.SiteCode,procedure_name,rs.total_cpu_time, rank
SELECT RunID, SiteCode,total_cpu_time, procedure_name, rank()
over (Partition BY sitecode
ORDER BY total_cpu_time desc ) AS Rank

FROM TopProcs
-- ) rs WHERE Rank between 1 and 15 and RunID = 1 and SiteCode in ('CAS', 'P01')
) rs WHERE RunID = 1 and SiteCode in ('CAS', 'P01')
order by SiteCode`

Here is output,

RunID SiteCode procedure_name total_cpu_time rank
1 CAS spDRSActivation 117039139161 1
1 CAS spDRSSendChangesForGroup 155827022 2
1 CAS spDRSMsgBuilderActivation 153595640 3
1 CAS spGetChangeNotifications 360607 30
1 CAS spDRSSendSyncComplete 100169 65
1 CAS spSendRcmServiceBrokerMessage 88270 67
1 CAS spLogEntry 53466 78
1 P01 spDRSMsgBuilderActivation 62843590384 9
1 P01 spDRSSendChangesForGroup 62746448352 10
1 P01 spDrsSummarizeSendHistory 54443397908 13
1 P01 CH_SummarizePolicyRequests 35371363957 18
1 P01 spProcessDCMComplianceMsg 29790879064 25
1 P01 spUpdateComplianceDetails 22106121907 26


I am assuming that you want to get the TOP N result for the sitecode partition with total_cpu_time as order. You can put the Where clause inside if you need to filter data afterwords just filter in inner query.

You can use the Row_Number() instead of the rank(),

SELECT rs.RunID,rs.SiteCode,rs.procedure_name,rs.total_cpu_time, rn
            Row_Number() OVER (PARTITION BY SiteCode ORDER BY total_cpu_time DESC) AS rn
        FROM TopProcs 
           RunID = 1 
           AND SiteCode IN('CAS', 'P01')
    ) rs 
    rs.rn < 4

This will return the top 3 rows for each SiteCode.