user5495762 user5495762 - 1 year ago 59
MySQL Question

How to get average per group and figure out outliers in SQL

This is what my data looks like:

id | value | group
1 | 4 | abc
2 | 8 | def
3 | 100 | abc
4 | 8 | ghi
5 | 7 | abc
6 | 10 | ghi

I need to figure out the averages per group where outliers (for e.g. id = 3 for group = abc) are excluded. Then display the ouliers next to averages. For above data I am expecting something like this as result:

group = 'abc'
average = '5.5'
outlier = '100'

Answer Source

One method creates a subquery containing the stats for each group (mean and standard deviation), and then joins this back to the original table to determine which records are outliers, for which group.

SELECT, AS `group`,
       t2.valAvg AS average,
       t1.value AS outlier
FROM yourTable t1
    SELECT `group`, AVG(value) AS valAvg, STDDEV(value) AS valStd
    FROM yourTable
    GROUP BY `group`
) t2
    ON =
WHERE ABS(t1.value - t2.valAvg) > t2.valStd   -- any record whose value is MORE
                                              -- than one standard deviation from
                                              -- the mean is an outlier


It appears that, for some reason, your value column is actual varchar rather than a numeric type. This means you won't be able to do any math on it. So first, convert that column to integer via:


If you only want outliers which are greater than the average then use the following WHERE clause:

WHERE t1.value - t2.valAvg > t2.valStd