Severin - 10 months ago 41

SQL Question

I want to calculate the median and average for given coordinates in a specific radius.

The important attributes are:

- latitude

- longitude

- price

The sql command to calculate the average is:

`SELECT avg(price) as average`

FROM (SELECT r.*,

( 6371 * acos( cos( radians(37.3541079) ) * cos( radians( ANY_VALUE(`latitude` )) ) * cos( radians( ANY_VALUE(`longitude`) ) - radians(-121.9552356) ) + sin( radians(37.3541079) ) * sin( radians( ANY_VALUE(`latitude`) ) ) ) ) AS distance

FROM `Rental` r

) r

WHERE distance <= 20;

My question is how can I calculate the median for the price in the given coordinates and radius. MySQL has no median() function.

Now I have tried the code from Simple way to calculate median with MySQL

`SELECT AVG(middle_values) AS 'median' FROM (`

SELECT t1.price AS 'middle_values' FROM

(

SELECT @row:=@row+1 as `row`, x.price

FROM rental AS x, (SELECT @row:=0) AS r

WHERE 1

-- put some where clause here

ORDER BY x.price

) AS t1,

(

SELECT COUNT(*) as 'count'

FROM rental x

WHERE 1

-- put same where clause here

) AS t2

-- the following condition will return 1 record for odd number sets, or 2 records for even number sets.

WHERE t1.row >= t2.count/2 and t1.row <= ((t2.count/2) +1)) AS t3;

It works for all the 200'000 records, but when I will add the

`WHERE distance <= 20`

`SELECT AVG(middle_values) AS 'median' FROM (`

SELECT t1.price AS 'middle_values' FROM

(

SELECT @row:=@row+1 as `row`, x.price

FROM rental AS x, (SELECT @row:=0) AS r, (SELECT a.*,

( 6371 * acos( cos( radians(37.3541079) ) * cos( radians( ANY_VALUE(`latitude` )) ) * cos( radians( ANY_VALUE(`longitude`) ) - radians(-121.9552356) ) + sin( radians(37.3541079) ) * sin( radians( ANY_VALUE(`latitude`) ) ) ) ) AS distance

FROM `Rental` a

) a

WHERE distance <= 20

-- put some where clause here

ORDER BY x.price

) AS t1,

(

SELECT COUNT(*) as 'count'

FROM rental x, (SELECT a.*,

( 6371 * acos( cos( radians(37.3541079) ) * cos( radians( ANY_VALUE(`latitude` )) ) * cos( radians( ANY_VALUE(`longitude`) ) - radians(-121.9552356) ) + sin( radians(37.3541079) ) * sin( radians( ANY_VALUE(`latitude`) ) ) ) ) AS distance

FROM `Rental` a

) a

WHERE distance <= 20

-- put same where clause here

) AS t2

-- the following condition will return 1 record for odd number sets, or 2 records for even number sets.

WHERE t1.row >= t2.count/2 and t1.row <= ((t2.count/2) +1)) AS t3;

Is there somewhere a misstep?

Answer Source

The problem is with the table scan to compute the distances, not with the median.

- Put the data in a
`TEMPORARY TABLE`

so you don't have to evaluate it 3 times (avg, count, and median). - Add a "bounding box" to the innermost
`WHERE`

to limit the checks to a 20x20 "square". `INDEX(latitude)`

- Use
`HAVING distance < 20`

instead of needing yet-another subquery.