davr - 1 year ago 101

# Simple way to calculate median with MySQL

What's the simplest (and hopefully not too slow) way to calculate the median with MySQL? I've used `AVG(x)` for finding the mean, but I'm having a hard time finding a simple way of calculating the median. For now, I'm returning all the rows to PHP, doing a sort, and then picking the middle row, but surely there must be some simple way of doing it in a single MySQL query.

Example data:

``````
id | val
---+----
1  | 4
2  | 7
3  | 2
4  | 2
5  | 9
6  | 8
7  | 3
``````

Sorting on `val` gives `2 2 3 4 7 8 9`, so the median should be `4`, versus `SELECT AVG(val)`, which returns `5`.

The problem with the proposed solution (TheJacobTaylor's) is runtime: joining the table to itself is slow as molasses on large datasets. My proposed alternative runs entirely in MySQL, avoids the self-join and so runs fast, and uses an explicit ORDER BY, so you don't have to hope your indexes happened to order the rows correctly. The query is also easy to unroll for debugging, since each subquery can be run on its own.

``````
SELECT avg(t1.val) AS median_val FROM (
    SELECT @rownum:=@rownum+1 AS `row_number`, d.val
    FROM data d, (SELECT @rownum:=0) r
    WHERE 1
    -- put some where clause here
    ORDER BY d.val
) AS t1,
(
    SELECT count(*) AS total_rows
    FROM data d
    WHERE 1
    -- put same where clause here
) AS t2
WHERE 1
AND t1.row_number IN ( floor((total_rows+1)/2), floor((total_rows+2)/2) );
``````
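To try the query against the question's sample data, a setup along these lines should work. The table name `data` and the columns `id` and `val` come from the question and the query above; the column types are an assumption, since the question only shows the values.

```sql
-- Column types are assumed; the question only lists the values.
CREATE TABLE data (
  id  INT PRIMARY KEY,
  val INT NOT NULL
);

-- The seven rows from the question's example.
INSERT INTO data (id, val) VALUES
  (1, 4), (2, 7), (3, 2), (4, 2), (5, 9), (6, 8), (7, 3);
```

With seven rows, `total_rows` is 7 and both floor expressions evaluate to 4, so only the fourth row in sorted order (`2 2 3 4 7 8 9`) is selected and the query should report a median of 4 (MySQL typically displays the `AVG` of an integer column as a decimal such as `4.0000`).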

The `avg()` around `t1.val` and the `row_number in (...)` filter are there so the query produces a correct median when there is an even number of records. Reasoning:

``````
SELECT floor((3+1)/2), floor((3+2)/2); -- total_rows is 3, so avg row_numbers 2 and 2
SELECT floor((4+1)/2), floor((4+2)/2); -- total_rows is 4, so avg row_numbers 2 and 3
``````
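The same middle-row arithmetic can also be written with window functions, which may be worth considering on MySQL 8.0 or later, where assigning user variables inside expressions (the `@rownum:=@rownum+1` pattern) is deprecated. This is a sketch of a swapped-in technique, not the query from the answer above; the aliases `ranked` and `rn` are made up for illustration, and it assumes the same `data` table with a `val` column.

```sql
-- Assumes MySQL 8.0 or later, where window functions are available.
SELECT AVG(ranked.val) AS median_val
FROM (
  SELECT d.val,
         ROW_NUMBER() OVER (ORDER BY d.val) AS rn,         -- 1-based position in sorted order
         COUNT(*)     OVER ()               AS total_rows  -- same total repeated on every row
  FROM data d
  -- put the WHERE clause here (it only needs to appear once)
) AS ranked
-- Odd total: both expressions name the single middle row.
-- Even total: they name the two middle rows, which AVG then averages.
WHERE ranked.rn IN (FLOOR((ranked.total_rows + 1) / 2),
                    FLOOR((ranked.total_rows + 2) / 2));
```

Because the row count is computed in the same subquery as the row numbers, any filter only has to be written in one place, instead of being duplicated across two subqueries.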