user3206440 user3206440 - 1 month ago 6
SQL Question

SQL - aggregate functions cannot be nested error in postgresql

table1

date_time | make | model | miles | reg_no | age_months
----------------------------------------------------------------------
2016-09-28 20:05:03.001 | toyota | prius | 10200 | 1111 | 22
2016-09-28 20:06:03.001 | suzuki | sx4 | 10300 | 1122 | 12
2016-09-28 20:09:03.001 | suzuki | sx4 | 11200 | 1133 | 34
2016-09-28 20:10:03.001 | toyota | prius | 15200 | 1144 | 28
2017-05-28 20:11:03.001 | toyota | prius | 15500 | 1144 | 36


For data in table1 above I wanted do some aggregations on
miles
per
month
by model like mean, median, q1, q3, iqr etc.

The query I've is as follows, but it gives error:
aggregate functions cannot be nested
- what is the right way to go about this ?

select
model
, COUNT(DISTINCT reg_no) AS distinct_car_count
, COUNT(*) AS records_count
, ROUND(AVG(miles/age_months*1.0),2) AS miles_per_month_avg
, ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY (miles/age_months*1.0) ASC),2) AS miles_per_month_med
, ROUND(PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY (miles/age_months*1.0) ASC),2) AS miles_per_month_q1
, ROUND(PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY (miles/age_months*1.0) ASC),2) AS miles_per_month_q3
, miles_per_month_q3 - miles_per_month_q1 as miles_per_month_iqr
, sum(case when miles/age_months*1.0 < (miles_per_month_q1 - 1.5*miles_per_month_iqr) then 1 else 0 end) as miles_per_month_num_records_outliers_lower_bound
, sum(case when miles/age_months*1.0 > (miles_per_month_q3 + 1.5*miles_per_month_iqr) then 1 else 0 end) as miles_per_month_records_outliers_upper_bound
, ROUND(stddev_pop(miles/age_months*1.0),2) as miles_per_month_stddev

from table1 a
group by model;

Answer

There are two issues:

#1: You can't nest aggregates (as the error message clearly indicates), miles_per_month_q1 is an aggregated column and yu try to use it in another aggregate miles_per_month_num_records_outliers_lower_bound.

#2: You try to reuse a column alias miles_per_month_q1 within the calculation of miles_per_month_iqr, which is not allowed in Standard SQL.

For both cases you need to add another nesting level (i.e. Derived Table or Common Table Expression), in your case it's probably:

SELECT
    a.model
    , Count(DISTINCT reg_no) AS distinct_car_count
    , Count(*) AS records_count
    , Round(Avg(miles/age_months*1.0),2) AS miles_per_month_avg

    -- now you can use the aliases, but you have to add a dummy (it's always the same value for a given model) aggregation function like MIN or MAX    
    , Min(percentiles.miles_per_month_med)
    , Min(percentiles.miles_per_month_q1)
    , Min(percentiles.miles_per_month_q3)
    , Min(percentiles.miles_per_month_q3 - percentiles.miles_per_month_q1) AS miles_per_month_iqr

    -- now it's no more nested aggregation
    , Sum(CASE WHEN miles/age_months*1.0 <  (percentiles.miles_per_month_q1 - 1.5* (percentiles.miles_per_month_q3 - percentiles.miles_per_month_q1)) THEN 1 ELSE 0 end) AS miles_per_month_num_records_outliers_lower_bound
    , Sum(CASE WHEN miles/age_months*1.0 >  (percentiles.miles_per_month_q3 + 1.5* (percentiles.miles_per_month_q3 - percentiles.miles_per_month_q1)) THEN 1 ELSE 0 end) AS miles_per_month_records_outliers_upper_bound

    , Round(StdDev_Pop(miles/age_months*1.0),2) AS miles_per_month_stddev

FROM table1 a
JOIN
 ( -- calculate the nested aggregates first
   SELECT
       model
       , Round(Percentile_Cont(0.5) Within GROUP (ORDER BY (miles/age_months*1.0) ASC),2) AS miles_per_month_med
       , Round(Percentile_Cont(0.25) Within GROUP (ORDER BY (miles/age_months*1.0) ASC),2) AS miles_per_month_q1
       , Round(Percentile_Cont(0.75) Within GROUP (ORDER BY (miles/age_months*1.0) ASC),2) AS miles_per_month_q3
   FROM table1 a
   GROUP BY model
 ) AS percentiles
ON a.model = percentiles.model
GROUP BY a.model