This is what my data looks like:
id | value | group
1 | 4 | abc
2 | 8 | def
3 | 100 | abc
4 | 8 | ghi
5 | 7 | abc
6 | 10 | ghi
group = 'abc'
average = '5.5'
outlier = '100'
One method creates a subquery containing the stats for each group (mean and standard deviation), and then joins this back to the original table to determine which records are outliers, for which group.
SELECT t1.id, t1.group AS `group`, t2.valAvg AS average, t1.value AS outlier FROM yourTable t1 INNER JOIN ( SELECT `group`, AVG(value) AS valAvg, STDDEV(value) AS valStd FROM yourTable GROUP BY `group` ) t2 ON t1.group = t2.group WHERE ABS(t1.value - t2.valAvg) > t2.valStd -- any record whose value is MORE -- than one standard deviation from -- the mean is an outlier
It appears that, for some reason, your
value column is actual
varchar rather than a numeric type. This means you won't be able to do any math on it. So first, convert that column to integer via:
ALTER TABLE yourTable MODIFY value INTEGER;
If you only want outliers which are greater than the average then use the following
WHERE t1.value - t2.valAvg > t2.valStd