Antony Smith Antony Smith - 5 months ago 20
SQL Question

SQL Server - Group By, Average and Percentiles

I have a FormSummaries table in SQL Server with the following relevant columns of example data:

FormName | CompletionTime
Form1 | 70
Form1 | 20
Form1 | 30
Form1 | 40
Form1 | 80
Form1 | 60
Form1 | 90
Form1 | 10
Form2 | 30
Form2 | 40
Form2 | 80
Form2 | 90
Form2 | 40
Form2 | 1000
Form2 | 120
Form2 | 70


What I need to do is:

1) Group the data by the form's name and an average of the completion time for that form, easy enough:

SELECT
FormName, AVG(CompletionTime)
FROM
FormSummaries
WHERE
CompletionTime is not null
GROUP BY
FormName


2) Get the average of the top 25%/bottom 25% of completion times for each form type (i.e. the average fastest and slowest 25% of time taken to complete each form). Ideally this would be in one query i.e.

FormName | Bottom25%AverageCompletionTime | Top25%AverageCompletionTime
Form1 | 85 | 15
Form2 | 560 | 35


I live in the real world and realise that may not be possible so separate queries for the top and bottom would be fine i.e.

FormName | Bottom25%AverageCompletionTime
Form1 | 85
Form2 | 560

FormName | Top25%AverageCompletionTime
Form1 | 15
Form2 | 35


I've looked at Partition by, Ntile and Over but I can't seem to be able to get anything to produce the desired result (though that may well be because I'm not implementing these correctly!).

Can anyone help please?

Many thanks.

Answer

NTILE Ranks the results in chunks, so you are interested in quarters, so use NTILE (4) to split into 4 groups, and partition over the formname. To do this with 2 queries, try

-- top 25%
SELECT  formname, AVG(CompletionTime) 
FROM
(SELECT 
    FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM 
    FormSummaries
WHERE CompletionTime IS NOT NULL )
    x
WHERE  QuartPercentile = 1
GROUP BY formname

-- bottom 25%
SELECT  formname, AVG(CompletionTime) 
FROM
(SELECT 
    FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM 
    FormSummaries 
WHERE CompletionTime IS NOT NULL)
    x
WHERE  QuartPercentile = 4
GROUP BY formname

Or with one query

SELECT  formname,AVG( case when QuartPercentile = 4 then CompletionTime else null end)   as [Bottom25%AverageCompletionTime]
, AVG( case when QuartPercentile = 1 then CompletionTime else null end)   as [Top25%AverageCompletionTime]
FROM
(SELECT 
    FormName,completiontime, NTILE(4) over (partition by FormName order by completiontime) as QuartPercentile
FROM 
    FormSummaries 
WHERE CompletionTime IS NOT NULL)
    x

GROUP BY formname

Bear in mind that if your completiontime column has integers, AVG will return an integer, so you may want to convert to get the required precision,eg

AVG( case when QuartPercentile = 1 then cast(CompletionTime AS decimal(9,2))  else null end) 
Comments